Module Code: ST1504 DELE
Name: Yadanar Aung
Admin No.: 2214621
Class: DAAA/FT/2B/07
CA1: Part A Convolutional Neural Network
Importing Libraries
import tensorflow as tf
# Check GPU is available
gpus = tf.config.experimental.list_physical_devices('GPU')
# Memory control: prevent TensorFlow from allocating the entire GPU memory
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
# Directory & File Handling
import os
import random
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix
# Preprocessing Data
from tensorflow.keras.utils import image_dataset_from_directory
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras import layers
# Building CNN Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization
# Callbacks
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.callbacks import ModelCheckpoint
Define Task
↳ Implement an image classifier using a deep learning network.
- Given colour images of 224 by 224 pixels, containing 15 types of vegetables.
- Convert given images into grayscale (i.e. only 1 channel instead of 3).
- Build two types of neural networks, one for each input size:
- 31 by 31 pixels
- 128 by 128 pixels
- Compare & discuss classification accuracies for each input size.
1. Background Research
Vegetables are all around us. They are grown and harvested on farms, then sold at markets for consumers.
Main colour groups of vegetables:
- Red
- Orange/Yellow
- White/Brown
- Green
- Blue/Purple
Some vegetables have distinct features that make them easy to identify, whereas others are harder to tell apart.
2. Data Understanding
- Import Datasets (train, validation & test)
- Exploratory Data Analysis
Dataset Folder: Dataset for CA1 part A
We are provided with a 'Dataset for CA1 part A' dataset containing 3 subfolders named 'train', 'validation' & 'test'.
- Given color images of 224 by 224 pixels.
- Contains 15 types of vegetables.
# Define train, validation & test dataset paths
train_path = 'Data/Dataset for CA1 part A/train'
validation_path = 'Data/Dataset for CA1 part A/validation'
test_path = 'Data/Dataset for CA1 part A/test'
Basic Information of Dataset
- Total of 15,028 colored images of size 224 by 224 pixels.
- Total of 3 Sub-datasets:
- Train: 9028 images (60%)
- Validation: 3000 images (20%)
- Test: 3000 images (20%)
- 15 Classes to Predict:
- 'Bean', 'Bitter_Gourd', 'Bottle_Gourd', 'Brinjal', 'Broccoli', 'Cabbage', 'Capsicum', 'Carrot', 'Cauliflower', 'Cucumber', 'Papaya', 'Potato', 'Pumpkin', 'Radish', 'Tomato'
Import Train, Validation & Test Datasets
Use tf.keras.utils.image_dataset_from_directory. This returns a batched tf.data.Dataset.
- label_mode = 'categorical': labels are encoded as categorical (one-hot) vectors
- image_size: the size to which images are resized so that they match the input size the model expects
Source(s):
1. Dataset with 3 colour channels & size of 224 by 224 pixels
# Load Dataset for CA1 part A dataset
print("Train Dataset:")
train_ds = image_dataset_from_directory(train_path, labels = 'inferred', label_mode = 'categorical', image_size=(224, 224), batch_size=32)
print("\nValidation Dataset:")
validation_ds = image_dataset_from_directory(validation_path, labels = 'inferred', label_mode = 'categorical', image_size=(224, 224), batch_size=32)
print("\nTest Dataset:")
test_ds = image_dataset_from_directory(test_path, labels = 'inferred', label_mode = 'categorical', image_size=(224, 224), batch_size=32)
Train Dataset: Found 9028 files belonging to 15 classes. Validation Dataset: Found 3000 files belonging to 15 classes. Test Dataset: Found 3000 files belonging to 15 classes.
2. Dataset with 1 colour channel & size of 128 by 128 pixels
# Convert to grayscale & resize to 128 by 128 pixels
print("Train Dataset:")
train_ds_128 = image_dataset_from_directory(train_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (128, 128), batch_size=32)
print("\nValidation Dataset:")
validation_ds_128 = image_dataset_from_directory(validation_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (128, 128), batch_size=32)
print("\nTest Dataset:")
test_ds_128 = image_dataset_from_directory(test_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (128, 128), batch_size=32)
Train Dataset: Found 9028 files belonging to 15 classes. Validation Dataset: Found 3000 files belonging to 15 classes. Test Dataset: Found 3000 files belonging to 15 classes.
3. Dataset with 1 colour channel & size of 31 by 31 pixels
# Convert to grayscale & resize to 31 by 31 pixels
print("Train Dataset:")
train_ds_31 = image_dataset_from_directory(train_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (31, 31), batch_size=32)
print("\nValidation Dataset:")
validation_ds_31 = image_dataset_from_directory(validation_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (31, 31), batch_size=32)
print("\nTest Dataset:")
test_ds_31 = image_dataset_from_directory(test_path, labels = 'inferred', label_mode = 'categorical', color_mode = 'grayscale', image_size = (31, 31), batch_size=32)
Train Dataset: Found 9028 files belonging to 15 classes. Validation Dataset: Found 3000 files belonging to 15 classes. Test Dataset: Found 3000 files belonging to 15 classes.
Obtain Class Labels
# Class labels
class_names = train_ds.class_names
num_classes = len(class_names)
class_names
['Bean', 'Bitter_Gourd', 'Bottle_Gourd', 'Brinjal', 'Broccoli', 'Cabbage', 'Capsicum', 'Carrot', 'Cauliflower', 'Cucumber', 'Papaya', 'Potato', 'Pumpkin', 'Radish', 'Tomato']
Exploratory Data Analysis
- Check Class Balance of Train Dataset
- Visualise Images from Each 15 Classes
Check Class Balance of Train Dataset
# Count the number of files in each subfolder of a directory
def count_files(directory):
    class_counts = {}  # Dictionary to store counts for each class
    for dirpath, _, filenames in os.walk(directory):
        # Exclude the main directory itself
        if dirpath != directory:
            class_name = os.path.basename(dirpath)
            file_count = sum(1 for file in filenames if os.path.isfile(os.path.join(dirpath, file)))
            class_counts[class_name] = file_count
    return class_counts
# Number of images per Class in Train Dataset
train_class_counts = count_files(train_path)
# Number of images per class in Validation Dataset
validation_class_counts = count_files(validation_path)
# Number of images per class in Test Dataset
test_class_counts = count_files(test_path)
num = {'Train': train_class_counts, 'Validation': validation_class_counts, 'Test': test_class_counts}
df = pd.DataFrame(data=num)
df.columns = [['Number of Images','Number of Images','Number of Images'], ['Train','Validation','Test']]
df
Number of Images per Class:
| Class | Train | Validation | Test |
|---|---|---|---|
| Bean | 780 | 200 | 200 |
| Bitter_Gourd | 720 | 200 | 200 |
| Bottle_Gourd | 441 | 200 | 200 |
| Brinjal | 868 | 200 | 200 |
| Broccoli | 750 | 200 | 200 |
| Cabbage | 503 | 200 | 200 |
| Capsicum | 351 | 200 | 200 |
| Carrot | 256 | 200 | 200 |
| Cauliflower | 587 | 200 | 200 |
| Cucumber | 812 | 200 | 200 |
| Papaya | 566 | 200 | 200 |
| Potato | 377 | 200 | 200 |
| Pumpkin | 814 | 200 | 200 |
| Radish | 248 | 200 | 200 |
| Tomato | 955 | 200 | 200 |
# Visualise Distribution of Image Classes
# Plot Bar Graph of image counts for each class
def plot_counts(class_counts):
    sortedClasses = sorted(class_counts.keys(), key=lambda x: class_counts[x], reverse=True)
    counts = [class_counts[class_name] for class_name in sortedClasses]
    plt.figure(figsize=(10, 6))
    plt.barh(sortedClasses, counts, color='blue')  # Use barh for horizontal bars
    plt.ylabel('Class')
    plt.xlabel('Number of Images')
    plt.title('Number of Images Per Class in Train Dataset')
    plt.tight_layout()  # Adjust layout for better display
    plt.show()
# Plot Pie Chart of image counts for each class
def plot_pie_chart(class_counts):
    sorted_classes = sorted(class_counts.keys(), key=lambda x: class_counts[x])
    sizes = [class_counts[class_name] for class_name in sorted_classes]
    plt.figure(figsize=(8, 8))
    plt.pie(sizes, labels=sorted_classes, autopct='%1.1f%%', startangle=140)
    plt.title('Percentage of Distribution of Image Classes in Train Dataset')
    plt.show()
plot_counts(train_class_counts)
plot_pie_chart(train_class_counts)
Insights:
- Training dataset is heavily imbalanced.
Potential Setbacks:
- Overfitting
- Action needs to be taken to address this issue:
- Feature Engineering
- Data Augmentation
Visualise Train Dataset
Visualise First Image from Each Vegetable Class In Original Size & Colour
# Loop through each category and plot the first image from each
for i, category in enumerate(class_names):
    # Get the path to the first image in the current category folder
    category_path = os.path.join(train_path, category)
    images_in_folder = os.listdir(category_path)
    first_image_of_folder = images_in_folder[0]
    first_image_path = os.path.join(category_path, first_image_of_folder)
    # Load the image and convert it into a NumPy array, then normalize the values
    img = load_img(first_image_path, target_size=(224, 224))
    img_arr = img_to_array(img) / 255.0
    # Create a subplot and plot the image with its title and no axis
    # plt.subplot(rows, columns, index)
    plt.subplot(3, 5, i + 1)
    plt.imshow(img_arr)
    plt.title(category)
    plt.axis('off')
plt.suptitle(f'First Coloured Image Per Class with Image Size = {224, 224}', fontsize=15)
plt.tight_layout()
plt.show()
Insights:
The shapes of the vegetables are commonly round, oval, or elongated.
Colour makes the vegetables easy for humans to identify; however, much of that advantage is lost once the images are passed to computer vision in grayscale.
Visualise First Image from Each Vegetable Class In Grayscale
inputSizes = [(224, 224), (128, 128), (31, 31)]
for size in inputSizes:
    # Plot the 1st image from each class
    # Create figure & set size
    plt.figure(figsize=(7, 7))
    # Loop through each category and plot the first image from each
    for i, category in enumerate(class_names):
        # Get the path to the first image in the current category folder
        category_path = os.path.join(train_path, category)
        images_in_folder = os.listdir(category_path)
        first_image_of_folder = images_in_folder[0]
        first_image_path = os.path.join(category_path, first_image_of_folder)
        # Load the image and convert it into a NumPy array, then normalize the values
        img = load_img(first_image_path, target_size=size, color_mode='grayscale')
        img_arr = img_to_array(img) / 255.0
        # Create a subplot and plot the image with its title and no axis
        # plt.subplot(rows, columns, index)
        plt.subplot(3, 5, i + 1)
        plt.imshow(img_arr, cmap='gray')
        plt.title(category)
        plt.axis('off')
    plt.suptitle(f'First Grayscaled Image Per Class with Image Size = {size}', fontsize=15)
    plt.tight_layout()
    plt.show()
Insights:
Colour information is lost.
The common shapes of the vegetables can cause confusion.
This leads to ambiguity, as a vegetable becomes open to more than one interpretation.
Visualise Random Ten Images from Each Vegetable Class
# Create figure & set size
plt.figure(figsize=(20, 30))
# Loop through each classLabel & plot ten random images from each
for i, classLabel in enumerate(class_names):
    # Get the path to all images in the current classLabel folder
    classLabel_path = os.path.join(train_path, classLabel)
    images_in_folder = os.listdir(classLabel_path)
    # Randomly select ten images from the current classLabel
    random_images = random.sample(images_in_folder, 10)
    for j, image_name in enumerate(random_images):
        # Get the path to the current image
        image_path = os.path.join(classLabel_path, image_name)
        # Load the image & convert it into a NumPy array
        img = load_img(image_path, target_size=(224, 224), color_mode='grayscale')
        img_arr = img_to_array(img) / 255.0  # Normalize values
        # Create subplot & plot image with title & no axis
        plt.subplot(num_classes, 10, i * 10 + j + 1)
        plt.imshow(img_arr, cmap='gray')
        plt.title(classLabel)
        plt.axis('off')
plt.suptitle('Ten Random Images Per Vegetable Class', fontsize=20)
plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.show()
Insights:
- Orientation is not consistent across images.
- Backgrounds vary: some images are taken against natural surroundings, others against a white backdrop.
- The quantity of vegetables varies across images.
- The sizes of the vegetables also vary across images.
- A hand is present in some images and absent in others.
- Some images are stretched.
- Different perspectives of the vegetables are shown in the images.
- E.g. top view / insides / zoomed in
- In conclusion, the dataset is noisy and not clean.
- Based on this random selection of images, there is no clear indication of wrongly classified vegetables in the dataset.
3. Data Pre-Processing
One-Hot Encoding
The class labels were already one-hot encoded when we imported the train dataset with image_dataset_from_directory:
- label_mode='categorical': TensorFlow automatically represents labels using one-hot encoding.
Nevertheless, I will inspect the one-hot encoded vectors to confirm this.
# Iterate through train_ds & print the one-hot encoded class labels
for images, labels in train_ds:
    print("One-Hot Encoded Labels in First Batch Size of 32:")
    print(labels)
    break
One-Hot Encoded Labels in First Batch Size of 32: tf.Tensor( [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.] [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] [0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]], shape=(32, 15), dtype=float32)
Handle Class Imbalance
As the class labels are imbalanced, I will be investigating if Data Augmentation will be beneficial to the model.
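As an alternative (or complement) to augmentation, class weighting can counter the imbalance: compute_class_weight, imported earlier, derives per-class weights that can be passed to model.fit via its class_weight argument. Below is a minimal sketch using the per-class train counts from the table above; the dictionary order is assumed to follow class_names.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Per-class train counts from the table above (order assumed to follow class_names)
counts = {'Bean': 780, 'Bitter_Gourd': 720, 'Bottle_Gourd': 441, 'Brinjal': 868,
          'Broccoli': 750, 'Cabbage': 503, 'Capsicum': 351, 'Carrot': 256,
          'Cauliflower': 587, 'Cucumber': 812, 'Papaya': 566, 'Potato': 377,
          'Pumpkin': 814, 'Radish': 248, 'Tomato': 955}

# Expand the counts into a label vector, then let sklearn balance them:
# weight_i = n_samples / (n_classes * count_i)
labels = np.concatenate([np.full(n, i) for i, n in enumerate(counts.values())])
weights = compute_class_weight(class_weight='balanced', classes=np.unique(labels), y=labels)
class_weight = dict(enumerate(weights))  # e.g. model.fit(..., class_weight=class_weight)
```

Rare classes such as Radish (248 images) receive proportionally larger weights than abundant ones such as Tomato (955 images).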
Data Augmentation
This generates additional training data from existing examples by applying random transformations that yield believable-looking images. It exposes the model to more aspects of the data, helping it generalize better.
This prevents the overfitting that might otherwise occur due to the small number of training examples.
Source:
https://www.tensorflow.org/tutorials/images/data_augmentation
https://keras.io/api/layers/preprocessing_layers/image_augmentation/
Create Preprocessing layer to perform Data Augmentation
# Function to perform data augmentation on the train dataset
def dataAug(ds):
    # Preprocessing layers for data augmentation
    data_augmentation = tf.keras.Sequential([
        layers.RandomFlip("horizontal_and_vertical"),
        layers.RandomRotation(0.2),
        layers.RandomZoom(0.2)
    ])
    AUTOTUNE = tf.data.AUTOTUNE
    ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y),
                num_parallel_calls=AUTOTUNE)
    # Use buffered prefetching on all datasets
    return ds.prefetch(buffer_size=AUTOTUNE)
Perform Data Augmentation on Train Dataset
train_ds_128_dataAug = dataAug(train_ds_128)
train_ds_31_dataAug = dataAug(train_ds_31)
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op. (identical warnings repeated for Bitcast, StatelessRandomUniformV2 and ImageProjectiveTransformV3)
Visualize a Sample of Data-Augmented Images
# Take one batch from the augmented dataset
augmented_batch = next(iter(train_ds_128_dataAug))
# Extract images & labels
images, labels = augmented_batch
# Convert one-hot labels to integer indices
label_indices = tf.argmax(labels, axis=1).numpy()
# Visualize the images
plt.figure(figsize=(15, 8))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"), cmap='gray')
    plt.title(f"Label: {class_names[label_indices[i]]}")
    plt.axis("off")
plt.show()
Combine the original train dataset & the augmented train dataset.
Now we have double the number of train images, at 18,056.
# Concatenate data augmentation dataset with original dataset
train_ds_128_dataAug = train_ds_128_dataAug.concatenate(train_ds_128)
train_ds_31_dataAug = train_ds_31_dataAug.concatenate(train_ds_31)
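As a sanity check on the combined size, tf.data.Dataset.cardinality reports the number of batches, and concatenation adds them. This is a small sketch with synthetic tensors standing in for the real image folders:

```python
import tensorflow as tf

# Toy batched datasets standing in for the original & augmented train sets
ds_a = tf.data.Dataset.from_tensor_slices(tf.zeros((64, 4))).batch(32)  # 2 batches
ds_b = tf.data.Dataset.from_tensor_slices(tf.zeros((96, 4))).batch(32)  # 3 batches
combined = ds_a.concatenate(ds_b)

# cardinality() counts batches (not individual samples)
print(int(combined.cardinality().numpy()))  # → 5
```

For the real train sets this gives 283 + 283 = 566 batches of (at most) 32 images, i.e. 2 × 9,028 = 18,056 images.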
Rescaling Input Pixel Values
A neural network learns its weights by repeatedly adding gradient error vectors (scaled by a learning rate), computed via backpropagation, to the weights throughout the network as training samples pass through it.
Hence, it is important to normalize the inputs so that the ranges of the feature-value distributions do not differ from feature to feature. Otherwise, the learning rate would cause corrections in each dimension that differ proportionally from one another.
Normalizing the inputs is ideal because we do not want gradient descent to oscillate around, or move slowly towards, a better minimum in weight space instead of converging to it.
# Extract a batch from the dataset
for images, labels in train_ds_128_dataAug.take(1):
    # Print statistics
    print("Pixel Statistics for the original train dataset:")
    print(f"Minimum pixel value: {images.numpy().min()}")
    print(f"Maximum pixel value: {images.numpy().max()}")
    print(f"Mean pixel value: {images.numpy().mean()}")
    print(f"Standard deviation of pixel values: {images.numpy().std()}")
Pixel Statistics for the original train dataset: Minimum pixel value: 0.0 Maximum pixel value: 253.56655883789062 Mean pixel value: 113.79469299316406 Standard deviation of pixel values: 51.41456985473633
Currently, the input pixel values are in the range [0, 255].
Hence, I am going to rescale the pixel values so that all inputs are on the same scale, helping the neural network converge better during gradient descent.
I will rescale all three datasets (train, validation & test) to maintain consistency.
I will rescale both train datasets with and without data augmentation for comparison later on.
# Rescale inputs from [0 - 255] to [0 - 1]
# Input size: 128 by 128 pixels
train_ds_128_rescaled = train_ds_128.map(lambda x, y: (x / 255.0, y))
train_ds_128_dataAug_rescaled = train_ds_128_dataAug.map(lambda x, y: (x / 255.0, y))
validation_ds_128_rescaled = validation_ds_128.map(lambda x, y: (x / 255.0, y))
test_ds_128_rescaled = test_ds_128.map(lambda x, y: (x / 255.0, y))
# Input size: 31 by 31 pixels
train_ds_31_rescaled = train_ds_31.map(lambda x, y: (x / 255.0, y))
train_ds_31_dataAug_rescaled = train_ds_31_dataAug.map(lambda x, y: (x / 255.0, y))
validation_ds_31_rescaled = validation_ds_31.map(lambda x, y: (x / 255.0, y))
test_ds_31_rescaled = test_ds_31.map(lambda x, y: (x / 255.0, y))
# Extract a batch from the normalized dataset
for images, labels in train_ds_128_dataAug_rescaled.take(1):
    # Print statistics
    print("Pixel Statistics for the normalized train dataset:")
    print(f"Minimum pixel value: {images.numpy().min()}")
    print(f"Maximum pixel value: {images.numpy().max()}")
    print(f"Mean pixel value: {images.numpy().mean()}")
    print(f"Standard deviation of pixel values: {images.numpy().std()}")
Pixel Statistics for the normalized train dataset: Minimum pixel value: 0.0 Maximum pixel value: 0.9960784316062927 Mean pixel value: 0.45157885551452637 Standard deviation of pixel values: 0.21337109804153442
The input values are now in the range [0, 1].
Check Shape of Batches of 32 Images & Label
- Shape should be as follows: [samples][width][height][channel]
# Default shape in TF: N H W C
for image_batch, labels_batch in train_ds_128_dataAug_rescaled:
    print('Shape of Each Batch of 32 Images:', image_batch.shape)  # batch of 32 images of shape 128 x 128 x 1 colour channel
    print('Shape of Each Batch 15 Classes:', labels_batch.shape)  # batch of 32 labels over 15 classes
    print('\nFirst Image in Batch:\n', image_batch[0])  # first image in the batch
    break
Shape of Each Batch of 32 Images: (32, 128, 128, 1) Shape of Each Batch 15 Classes: (32, 15) First Image in Batch: tf.Tensor( [[[0.21695852] [0.21750005] [0.2542193 ] ... [0.35875082] [0.36177886] [0.35930458]] [[0.19724315] [0.1998232 ] [0.22806044] ... [0.36083373] [0.36221662] [0.35816777]] [[0.1833131 ] [0.18680121] [0.20767565] ... [0.37250146] [0.37248576] [0.36548913]] ... [[0.41363707] [0.411267 ] [0.40676948] ... [0.61402327] [0.5958905 ] [0.59446615]] [[0.40693903] [0.39742497] [0.3899856 ] ... [0.5802054 ] [0.5755106 ] [0.5780593 ]] [[0.40645716] [0.39580986] [0.38733116] ... [0.53335845] [0.5367797 ] [0.5411091 ]]], shape=(128, 128, 1), dtype=float32)
4. Modelling
- Explanation of layers used:
  - Convolutional (Conv2D)
    - Extracts features from the 2D image, with a ReLU activation function
  - MaxPooling2D
    - Downsamples the image features
  - Flatten
    - Flattens the data from a 2D array to a 1D array
  - Dropout
    - Reduces overfitting by randomly dropping a number of output units from the layer during training
  - Dense
    - Fully connected layer that combines the extracted features for classification
  - Batch Normalization
    - Purpose: standardizes the inputs of a layer across a single batch
    - Why?: internal covariate shift: as parameters change during training, the activations in intermediate layers are constantly shifting
    - Importance: normalizing each layer's input reduces internal covariate shift, speeds up training and allows higher learning rates, making learning easier
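The layer ordering described above can be sketched as a single convolutional block. This is an illustrative sketch only, not the baseline model built below; the 128 by 128 by 1 input shape and 15 output classes match this task.

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                                     Dropout, Flatten, Dense)

block = Sequential([
    tf.keras.Input(shape=(128, 128, 1)),    # grayscale 128 x 128 input
    Conv2D(32, (3, 3), activation='relu'),  # feature extraction -> (126, 126, 32)
    BatchNormalization(),                   # standardize activations per batch
    MaxPooling2D((2, 2)),                   # downsample -> (63, 63, 32)
    Dropout(0.2),                           # regularization
    Flatten(),                              # 2D feature maps -> 1D vector
    Dense(15, activation='softmax'),        # one output per vegetable class
])
block.summary()
```

BatchNormalization is typically placed directly after a Conv2D layer and before pooling, so the standardized activations feed the rest of the block.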
# Fix random seeds for reproducibility
seed = 42
np.random.seed(seed)
tf.random.set_seed(seed)
random.seed(seed)
Evaluation Methodology:
- Accuracy Curve
- The gap between training & validation accuracy is a clear indication of overfitting. The larger the gap, the higher the overfitting.

- Loss Curve

Source: https://towardsdatascience.com/useful-plots-to-diagnose-your-neural-network-521907fa2f45
Utility Functions
# Function to plot comparison between train & validation metrics
def plotCompareMetrics(history):
    # Plot train vs validation metric per epoch
    plt.figure(figsize=(15, 5))
    plt.subplot(1, 2, 1)
    plt.grid(True)
    # Accuracy
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    # Make it pretty
    plt.title('Model Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epochs')
    plt.legend(['Train', 'Validation'], loc='best')
    plt.subplot(1, 2, 2)
    plt.grid(True)
    # Loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    # Make it pretty
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epochs')
    plt.legend(['Train', 'Validation'], loc='best')
    plt.show()
# Function to plot the confusion matrix
def plot_confusion_matrix(y_true, y_preds, class_names):
    cm = confusion_matrix(y_true, y_preds)
    plt.figure(figsize=(15, 9))
    sns.heatmap(cm, annot=True, fmt='.0f')
    plt.ylabel("True values", size=20)
    plt.xlabel('Predicted values', size=20)
    plt.xticks(ticks=np.arange(len(class_names)) + 0.5, labels=class_names, rotation=60)
    plt.yticks(ticks=np.arange(len(class_names)) + 0.5, labels=class_names, rotation=0)
# Function to make predictions on the test dataset
def makePredictions(model, test_dataset, class_names):
    test_ds_images = []
    test_ds_labels = []
    for image_batch, labels_batch in test_dataset.unbatch().as_numpy_iterator():
        test_ds_images.append(image_batch)
        test_ds_labels.append(labels_batch)
    # Currently, test_ds_images & test_ds_labels are lists of NumPy arrays
    # Convert them into NumPy arrays
    test_ds_images = np.array(test_ds_images)
    test_ds_labels = np.array(test_ds_labels)
    # Class probabilities for each sample
    y_probs = model.predict(test_ds_images)  # probabilities for each class, per sample
    # Convert prediction probabilities to predicted class labels (integers)
    y_preds = np.argmax(y_probs, axis=1)
    # Extract true labels from the test dataset (one-hot labels to integers)
    y_labels = np.argmax(test_ds_labels, axis=1)
    # Store correctly classified images
    true_class = list()
    # Store wrongly classified images
    false_class = list()
    for i in range(len(test_ds_images)):
        if y_preds[i] != y_labels[i]:
            false_class.append((test_ds_images[i], y_preds[i], y_labels[i]))
        else:
            true_class.append((test_ds_images[i], y_preds[i], y_labels[i]))
    return y_labels, y_preds, true_class, false_class
# Function to display images with true and predicted labels
def plot_prediction(predicted_images):
    random.shuffle(predicted_images)
    plt.figure(figsize=(10, 10))
    for i in range(1, 17):
        plt.subplot(4, 4, i)
        plt.imshow(predicted_images[i][0].astype("float"), cmap='gray')
        plt.title(f'True: {class_names[predicted_images[i][2]]}\nPredicted: {class_names[predicted_images[i][1]]}')
        plt.axis('off')
    plt.tight_layout()
Callbacks
EarlyStopping monitoring 'val_loss'
# Instantiate an early stopping callback to prevent overfitting
early_stopping = EarlyStopping(monitor = 'val_loss', patience = 10)
Baseline Simple CNN Model
I will start by building a simple baseline CNN model to evaluate whether data augmentation is beneficial to our models.
Afterwards, I will improve the model by making it more complex.
A. Input size: 128 by 128 pixels
Original Rescaled Train Dataset
baseline_model_128_original = Sequential()
# Convolutional Layer
baseline_model_128_original.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Pooling layer
baseline_model_128_original.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
baseline_model_128_original.add(Dropout(0.2))
# Convolutional Layer
baseline_model_128_original.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
baseline_model_128_original.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
baseline_model_128_original.add(Flatten())
# Fully connected layer
baseline_model_128_original.add(Dense(128, activation='relu'))
baseline_model_128_original.add(Dense(num_classes, activation='softmax'))
# Compile baseline_model_128_original
baseline_model_128_original.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the model summary
baseline_model_128_original.summary()
Model: "sequential_12"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_44 (Conv2D)              (None, 126, 126, 32)    320
max_pooling2d_23 (MaxPooling2D) (None, 63, 63, 32)      0
dropout_29 (Dropout)            (None, 63, 63, 32)      0
conv2d_45 (Conv2D)              (None, 61, 61, 64)      18496
max_pooling2d_24 (MaxPooling2D) (None, 30, 30, 64)      0
flatten_8 (Flatten)             (None, 57600)           0
dense_16 (Dense)                (None, 128)             7372928
dense_17 (Dense)                (None, 15)              1935
=================================================================
Total params: 7,393,679
Trainable params: 7,393,679
Non-trainable params: 0
_________________________________________________________________
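The parameter counts in the summary can be verified by hand: a Conv2D layer has (kernel_height × kernel_width × input_channels + 1) × filters parameters, and a Dense layer has (inputs + 1) × units. A quick check of the totals above:

```python
def conv2d_params(kh, kw, in_ch, filters):
    # one weight per kernel position per input channel, plus one bias per filter
    return (kh * kw * in_ch + 1) * filters

def dense_params(inputs, units):
    # one weight per input per unit, plus one bias per unit
    return (inputs + 1) * units

total = (
    conv2d_params(3, 3, 1, 32)         # conv2d_44:  320
    + conv2d_params(3, 3, 32, 64)      # conv2d_45:  18,496
    + dense_params(30 * 30 * 64, 128)  # dense_16:   7,372,928 (flatten = 57,600)
    + dense_params(128, 15)            # dense_17:   1,935
)
print(total)  # 7393679, matching the model summary
```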
Architecture of Baseline Simple CNN Model
tf.keras.utils.plot_model(baseline_model_128_original, show_shapes=True)
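The same baseline architecture is rebuilt for every input size and dataset variant in this notebook. As a refactoring sketch (not part of the original code), a small factory function could remove that duplication:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def build_baseline_cnn(input_shape, num_classes, dropout=0.2):
    """Baseline CNN used throughout: Conv-Pool-Dropout-Conv-Pool-Dense."""
    model = Sequential([
        Conv2D(32, (3, 3), input_shape=input_shape, activation='relu'),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(dropout),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model
```

`build_baseline_cnn((128, 128, 1), 15)` reproduces the model above, and `build_baseline_cnn((31, 31, 1), 15)` the 31-by-31 variant built later.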
# Fit model with original train dataset
history_baseline_model_128_original = baseline_model_128_original.fit(train_ds_128_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save weights
baseline_model_128_original.save_weights('baseline_model_128_rescaled_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_baseline_model_128_original)
# Final evaluation of model
scores = baseline_model_128_original.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30  283/283 - 4s 11ms/step - loss: 2.0959 - accuracy: 0.3423 - val_loss: 1.7173 - val_accuracy: 0.4763
Epoch 2/30  283/283 - 3s 11ms/step - loss: 1.2845 - accuracy: 0.6071 - val_loss: 1.2348 - val_accuracy: 0.6210
Epoch 3/30  283/283 - 3s 11ms/step - loss: 0.8065 - accuracy: 0.7543 - val_loss: 0.9968 - val_accuracy: 0.7003
... (epochs 4-28 omitted; training accuracy reaches 1.0000 by epoch 17 while val_accuracy plateaus around 0.78-0.79)
Epoch 29/30 283/283 - 3s 11ms/step - loss: 7.7541e-05 - accuracy: 1.0000 - val_loss: 1.5267 - val_accuracy: 0.7907
Epoch 30/30 283/283 - 3s 11ms/step - loss: 7.0626e-05 - accuracy: 1.0000 - val_loss: 1.5419 - val_accuracy: 0.7917

Loss on Test Dataset: 1.4051
Accuracy on Test Dataset: 80.33%
CNN Error on Test Dataset: 19.67%
Insights for Accuracy:
In the plots above, the training accuracy climbs steadily towards 100%, whereas the validation accuracy stalls at around 75-79% during training.
There is a large gap in accuracy between training and validation, a clear sign of overfitting. The model hence has a difficult time generalising to new data.
As we saw in the Exploratory Data Analysis, with a small number of training examples the model can pick up noise or unwanted details from the training set, which hurts its performance on new examples.
Insights for Loss:
Similarly, the training and validation losses diverge over time, leaving a large gap.
Rescaled & Data-Augmented Train Dataset
baseline_model_128_dataAug = Sequential()
# Convolutional Layer
baseline_model_128_dataAug.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Pooling layer
baseline_model_128_dataAug.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
baseline_model_128_dataAug.add(Dropout(0.2))
# Convolutional Layer
baseline_model_128_dataAug.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
baseline_model_128_dataAug.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
baseline_model_128_dataAug.add(Flatten())
# Fully connected layer
baseline_model_128_dataAug.add(Dense(128, activation='relu'))
baseline_model_128_dataAug.add(Dense(num_classes, activation='softmax'))
# Compile baseline_model_128_dataAug
baseline_model_128_dataAug.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the baseline_model_128_dataAug summary
baseline_model_128_dataAug.summary()
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                    Output Shape          Param #
=================================================================
 conv2d (Conv2D)                 (None, 126, 126, 32)  320
 max_pooling2d (MaxPooling2D)    (None, 63, 63, 32)    0
 dropout (Dropout)               (None, 63, 63, 32)    0
 conv2d_1 (Conv2D)               (None, 61, 61, 64)    18496
 max_pooling2d_1 (MaxPooling2D)  (None, 30, 30, 64)    0
 flatten (Flatten)               (None, 57600)         0
 dense (Dense)                   (None, 128)           7372928
 dense_1 (Dense)                 (None, 15)            1935
=================================================================
Total params: 7,393,679
Trainable params: 7,393,679
Non-trainable params: 0
_________________________________________________________________
# Fit model with rescaled & data-augmented train dataset
history_baseline_model_128_dataAug = baseline_model_128_dataAug.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# save weights
baseline_model_128_dataAug.save_weights('baseline_model_128_dataAug_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_baseline_model_128_dataAug)
# Final evaluation of model
scores = baseline_model_128_dataAug.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30  566/566 - 12s 12ms/step - loss: 2.0684 - accuracy: 0.3405 - val_loss: 1.3736 - val_accuracy: 0.5873
Epoch 2/30  566/566 - 7s 12ms/step - loss: 1.3127 - accuracy: 0.5881 - val_loss: 0.9857 - val_accuracy: 0.6883
Epoch 3/30  566/566 - 7s 12ms/step - loss: 0.9742 - accuracy: 0.6938 - val_loss: 0.7624 - val_accuracy: 0.7697
... (epochs 4-28 omitted; train and validation metrics improve steadily together)
Epoch 29/30 566/566 - 7s 12ms/step - loss: 0.2017 - accuracy: 0.9382 - val_loss: 0.3105 - val_accuracy: 0.9193
Epoch 30/30 566/566 - 7s 12ms/step - loss: 0.2166 - accuracy: 0.9317 - val_loss: 0.2858 - val_accuracy: 0.9243

Loss on Test Dataset: 0.2692
Accuracy on Test Dataset: 91.90%
CNN Error on Test Dataset: 8.10%
Insights:
- Less overfitting with the rescaled & data-augmented train dataset.
- The gap between the train & validation metrics is smaller when training on the rescaled & data-augmented dataset.
- Test accuracy has improved compared to using the original rescaled dataset.
- However, the trade-off is that train accuracy has decreased compared to using the original rescaled dataset.
Hence, data augmentation has proven beneficial: by giving the model more varied examples to learn from, it reduces overfitting and improves the model's ability to generalise.
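The construction of `train_ds_128_dataAug_rescaled` is defined earlier in the notebook; as a hypothetical illustration of the technique, a Keras preprocessing-layer pipeline for this kind of augmentation might look like:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical augmentation pipeline; the actual transforms used to build
# train_ds_128_dataAug_rescaled are defined earlier in the notebook.
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),  # rotate by up to +/-10% of a full turn
    layers.RandomZoom(0.1),
])

# training=True activates the random transforms; shapes are preserved
images = tf.zeros((8, 128, 128, 1))
augmented = data_augmentation(images, training=True)
```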
B. Input size: 31 by 31 pixels
Original Rescaled Train Dataset
baseline_model_31_original = Sequential()
# Convolutional Layer
baseline_model_31_original.add(Conv2D(32, (3,3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
baseline_model_31_original.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
baseline_model_31_original.add(Dropout(0.2))
# Convolutional Layer
baseline_model_31_original.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
baseline_model_31_original.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
baseline_model_31_original.add(Flatten())
# Fully connected layer
baseline_model_31_original.add(Dense(128, activation='relu'))
baseline_model_31_original.add(Dense(num_classes, activation='softmax'))
# Compile baseline_model_31_original
baseline_model_31_original.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# print the baseline_model_31_original summary
baseline_model_31_original.summary()
Model: "sequential_14"
_________________________________________________________________
 Layer (type)                     Output Shape        Param #
=================================================================
 conv2d_48 (Conv2D)               (None, 29, 29, 32)  320
 max_pooling2d_27 (MaxPooling2D)  (None, 14, 14, 32)  0
 dropout_31 (Dropout)             (None, 14, 14, 32)  0
 conv2d_49 (Conv2D)               (None, 12, 12, 64)  18496
 max_pooling2d_28 (MaxPooling2D)  (None, 6, 6, 64)    0
 flatten_10 (Flatten)             (None, 2304)        0
 dense_20 (Dense)                 (None, 128)         295040
 dense_21 (Dense)                 (None, 15)          1935
=================================================================
Total params: 315,791
Trainable params: 315,791
Non-trainable params: 0
_________________________________________________________________
# Fit model with original train dataset
history_baseline_model_31_original = baseline_model_31_original.fit(train_ds_31_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save weights
baseline_model_31_original.save_weights('baseline_model_31_rescaled_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_baseline_model_31_original)
# Final evaluation of model
scores = baseline_model_31_original.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30  283/283 - 2s 6ms/step - loss: 2.3446 - accuracy: 0.2335 - val_loss: 1.9836 - val_accuracy: 0.3817
Epoch 2/30  283/283 - 2s 6ms/step - loss: 1.6670 - accuracy: 0.4670 - val_loss: 1.6116 - val_accuracy: 0.4783
Epoch 3/30  283/283 - 2s 6ms/step - loss: 1.3370 - accuracy: 0.5712 - val_loss: 1.3778 - val_accuracy: 0.5643
... (epochs 4-28 omitted; training accuracy approaches 0.98 while val_accuracy plateaus around 0.80-0.84)
Epoch 29/30 283/283 - 2s 6ms/step - loss: 0.0521 - accuracy: 0.9862 - val_loss: 0.6967 - val_accuracy: 0.8463
Epoch 30/30 283/283 - 2s 6ms/step - loss: 0.0603 - accuracy: 0.9809 - val_loss: 0.7824 - val_accuracy: 0.8343

Loss on Test Dataset: 0.7183
Accuracy on Test Dataset: 84.53%
CNN Error on Test Dataset: 15.47%
Rescaled & Data-Augmented Train Dataset
baseline_model_31_dataAug = Sequential()
# Convolutional Layer
baseline_model_31_dataAug.add(Conv2D(32, (3,3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
baseline_model_31_dataAug.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
baseline_model_31_dataAug.add(Dropout(0.2))
# Convolutional Layer
baseline_model_31_dataAug.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
baseline_model_31_dataAug.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
baseline_model_31_dataAug.add(Flatten())
# Fully connected layer
baseline_model_31_dataAug.add(Dense(128, activation='relu'))
baseline_model_31_dataAug.add(Dense(num_classes, activation='softmax'))
# Compile baseline_model_31_dataAug
baseline_model_31_dataAug.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# print the baseline_model_31_dataAug summary
baseline_model_31_dataAug.summary()
Model: "sequential_15"
_________________________________________________________________
 Layer (type)                     Output Shape        Param #
=================================================================
 conv2d_50 (Conv2D)               (None, 29, 29, 32)  320
 max_pooling2d_29 (MaxPooling2D)  (None, 14, 14, 32)  0
 dropout_32 (Dropout)             (None, 14, 14, 32)  0
 conv2d_51 (Conv2D)               (None, 12, 12, 64)  18496
 max_pooling2d_30 (MaxPooling2D)  (None, 6, 6, 64)    0
 flatten_11 (Flatten)             (None, 2304)        0
 dense_22 (Dense)                 (None, 128)         295040
 dense_23 (Dense)                 (None, 15)          1935
=================================================================
Total params: 315,791
Trainable params: 315,791
Non-trainable params: 0
_________________________________________________________________
# Fit model with rescaled & data-augmented train dataset
history_baseline_model_31_dataAug = baseline_model_31_dataAug.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# save weights
baseline_model_31_dataAug.save_weights('baseline_model_31_dataAug_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_baseline_model_31_dataAug)
# Final evaluation of model
scores = baseline_model_31_dataAug.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30  566/566 - 4s 7ms/step - loss: 2.2004 - accuracy: 0.2873 - val_loss: 1.7515 - val_accuracy: 0.4263
Epoch 2/30  566/566 - 4s 7ms/step - loss: 1.7217 - accuracy: 0.4421 - val_loss: 1.3903 - val_accuracy: 0.5513
Epoch 3/30  566/566 - 4s 7ms/step - loss: 1.4917 - accuracy: 0.5178 - val_loss: 1.1122 - val_accuracy: 0.6280
... (epochs 4-28 omitted; train and validation metrics improve steadily together)
Epoch 29/30 566/566 - 4s 7ms/step - loss: 0.4139 - accuracy: 0.8692 - val_loss: 0.4652 - val_accuracy: 0.8767
Epoch 30/30 566/566 - 4s 7ms/step - loss: 0.4023 - accuracy: 0.8732 - val_loss: 0.4804 - val_accuracy: 0.8730

Loss on Test Dataset: 0.3994
Accuracy on Test Dataset: 89.13%
CNN Error on Test Dataset: 10.87%
Insights:
- Less overfitting with the rescaled & data-augmented train dataset.
- The gap between the train & validation metrics is smaller when training on the rescaled & data-augmented dataset.
- Test accuracy has improved compared to using the original rescaled dataset.
- Train accuracy has decreased compared to using the original rescaled dataset.
Hence, data augmentation has also proven beneficial for the 31x31 input size: by giving the model more varied examples to learn from, it reduces overfitting and improves generalisation.
More Complex CNN Model
A larger CNN architecture with additional convolutional, max-pooling, dropout and batch-normalization layers, plus fully connected layers with more neurons.
Building on the results above, I will work towards improving both the train and test accuracies while reducing overfitting.
A. Input size: 128 by 128 pixels
- First off, I will start by increasing Dropout from 0.2 to 0.4.
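Raising the dropout rate zeroes a larger fraction of activations during training, forcing the network to rely on more redundant features. An illustrative sketch (not part of the original notebook) of how Keras `Dropout` behaves in training versus inference:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dropout

x = tf.ones((1, 8))
layer = Dropout(0.4)

# During training, roughly 40% of activations are zeroed and the survivors
# are scaled by 1/(1 - 0.4) so the expected activation sum is unchanged.
train_out = layer(x, training=True)

# At inference time, dropout is a no-op: inputs pass through unchanged.
infer_out = layer(x, training=False)
```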
complex_model_128_dropOut = Sequential()
# Convolutional Layer
complex_model_128_dropOut.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Pooling layer
complex_model_128_dropOut.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_dropOut.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_dropOut.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_dropOut.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
complex_model_128_dropOut.add(Flatten())
# Fully connected layer
complex_model_128_dropOut.add(Dense(128, activation='relu'))
complex_model_128_dropOut.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_128_dropOut
complex_model_128_dropOut.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the complex_model_128_dropOut summary
complex_model_128_dropOut.summary()
Model: "sequential_16"
_________________________________________________________________
 Layer (type)                     Output Shape          Param #
=================================================================
 conv2d_52 (Conv2D)               (None, 126, 126, 32)  320
 max_pooling2d_31 (MaxPooling2D)  (None, 63, 63, 32)    0
 dropout_33 (Dropout)             (None, 63, 63, 32)    0
 conv2d_53 (Conv2D)               (None, 61, 61, 64)    18496
 max_pooling2d_32 (MaxPooling2D)  (None, 30, 30, 64)    0
 flatten_12 (Flatten)             (None, 57600)         0
 dense_24 (Dense)                 (None, 128)           7372928
 dense_25 (Dense)                 (None, 15)            1935
=================================================================
Total params: 7,393,679
Trainable params: 7,393,679
Non-trainable params: 0
_________________________________________________________________
# Fit model with rescaled & data-augmented train dataset
history_complex_model_128_dropOut = complex_model_128_dropOut.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# save weights
complex_model_128_dropOut.save_weights('complex_model_128_dropOut_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_dropOut)
# Final evaluation of model
scores = complex_model_128_dropOut.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30  566/566 - 8s 12ms/step - loss: 2.2090 - accuracy: 0.3037 - val_loss: 1.4984 - val_accuracy: 0.5653
Epoch 2/30  566/566 - 7s 13ms/step - loss: 1.3787 - accuracy: 0.5640 - val_loss: 0.9402 - val_accuracy: 0.7140
Epoch 3/30  566/566 - 8s 13ms/step - loss: 1.0277 - accuracy: 0.6787 - val_loss: 0.7504 - val_accuracy: 0.7757
... (epochs 4-26 omitted; train and validation metrics improve steadily together)
Epoch 27/30 566/566 - 7s 13ms/step - loss: 0.2502 - accuracy: 0.9228 - val_loss: 0.2897 - val_accuracy: 0.9160
Epoch 28/30 566/566 - 7s 12ms/step - loss: 0.2750 - accuracy: 0.9140 - val_loss: 0.2727 ... (output truncated)
- val_accuracy: 0.9187 Epoch 29/30 566/566 [==============================] - 7s 12ms/step - loss: 0.2382 - accuracy: 0.9269 - val_loss: 0.2809 - val_accuracy: 0.9190 Epoch 30/30 566/566 [==============================] - 7s 12ms/step - loss: 0.2430 - accuracy: 0.9234 - val_loss: 0.2768 - val_accuracy: 0.9187
Loss on Test Dataset: 0.2564 Accuracy on Test Dataset: 92.13% CNN Error on Test Dataset: 7.87%
Insights:
Although the difference is small, the dropout layer does appear to reduce overfitting slightly, as the gap between the training and validation curves is a little narrower.
The test accuracy has increased while the train accuracy has decreased.
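The overfitting gap mentioned above can be quantified directly from a Keras `History` object. A minimal, hypothetical sketch (the helper name and the stand-in metrics dict below are my own, not from the notebook; in the notebook, `history.history` would be passed in):

```python
# Hypothetical helper: quantify the final-epoch train/validation accuracy gap
# from a Keras History. `history.history` is a dict of per-epoch metric lists;
# a small stand-in dict is used here so the sketch runs on its own.
def final_gap(history_dict):
    """Final-epoch training accuracy minus validation accuracy."""
    return history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1]

# Stand-in for history.history from a short run
example = {"accuracy": [0.88, 0.91, 0.92], "val_accuracy": [0.87, 0.90, 0.92]}
print(f"Overfitting gap: {final_gap(example):+.4f}")
```

A positive gap suggests the model fits the training data better than unseen data; a shrinking gap after adding dropout is consistent with reduced overfitting.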
- Now, I am going to add one more convolutional layer with more filters, and add dropout layers after each convolutional and dense layer. I will also add one BatchNormalization layer.
I have chosen to increase the number of filters with each successive Conv2D layer.
This is because the input the model receives is raw pixel data, which is noisy, so I will let the early layers of the CNN first extract relevant features from the noisy, "dirty" raw pixels.
Once the useful features have been extracted, the later layers can build more complex abstractions on top of them.
Hence, as the network gets deeper, the number of filters increases.
complex_model_128_2 = Sequential()
# Convolutional Layer
complex_model_128_2.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Pooling layer
complex_model_128_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_2.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_2.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_2.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_2.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_2.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_2.add(Flatten())
# Fully connected layer
complex_model_128_2.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_2.add(BatchNormalization())
# Dropout layer
complex_model_128_2.add(Dropout(0.4))
# Output layer
complex_model_128_2.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_128_2
complex_model_128_2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the complex_model_128_2 summary
complex_model_128_2.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 126, 126, 32) 320
max_pooling2d_2 (MaxPooling2D) (None, 63, 63, 32) 0
dropout_1 (Dropout) (None, 63, 63, 32) 0
conv2d_3 (Conv2D) (None, 61, 61, 64) 18496
max_pooling2d_3 (MaxPooling2D) (None, 30, 30, 64) 0
dropout_2 (Dropout) (None, 30, 30, 64) 0
conv2d_4 (Conv2D) (None, 28, 28, 128) 73856
max_pooling2d_4 (MaxPooling2D) (None, 14, 14, 128) 0
dropout_3 (Dropout) (None, 14, 14, 128) 0
flatten_1 (Flatten) (None, 25088) 0
dense_2 (Dense) (None, 128) 3211392
batch_normalization (BatchNormalization) (None, 128) 512
dropout_4 (Dropout) (None, 128) 0
dense_3 (Dense) (None, 15) 1935
=================================================================
Total params: 3,306,511
Trainable params: 3,306,255
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_2 = complex_model_128_2.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_2.save_weights('complex_model_128_2_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_2)
# Final evaluation of model
scores = complex_model_128_2.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 12s 19ms/step - loss: 2.2059 - accuracy: 0.2933 - val_loss: 3.3519 - val_accuracy: 0.1073 Epoch 2/30 566/566 [==============================] - 8s 14ms/step - loss: 1.5971 - accuracy: 0.4843 - val_loss: 3.3179 - val_accuracy: 0.1833 Epoch 3/30 566/566 [==============================] - 8s 13ms/step - loss: 1.3436 - accuracy: 0.5652 - val_loss: 1.6356 - val_accuracy: 0.4847 Epoch 4/30 566/566 [==============================] - 7s 13ms/step - loss: 1.1794 - accuracy: 0.6272 - val_loss: 2.0793 - val_accuracy: 0.3727 Epoch 5/30 566/566 [==============================] - 7s 13ms/step - loss: 0.9997 - accuracy: 0.6854 - val_loss: 1.1398 - val_accuracy: 0.6553 Epoch 6/30 566/566 [==============================] - 7s 13ms/step - loss: 0.8999 - accuracy: 0.7185 - val_loss: 1.1622 - val_accuracy: 0.6100 Epoch 7/30 566/566 [==============================] - 7s 13ms/step - loss: 0.7996 - accuracy: 0.7500 - val_loss: 1.3758 - val_accuracy: 0.5733 Epoch 8/30 566/566 [==============================] - 7s 13ms/step - loss: 0.7531 - accuracy: 0.7627 - val_loss: 0.7433 - val_accuracy: 0.7620 Epoch 9/30 566/566 [==============================] - 7s 13ms/step - loss: 0.6444 - accuracy: 0.7972 - val_loss: 1.0167 - val_accuracy: 0.6910 Epoch 10/30 566/566 [==============================] - 7s 13ms/step - loss: 0.5893 - accuracy: 0.8166 - val_loss: 0.7508 - val_accuracy: 0.7650 Epoch 11/30 566/566 [==============================] - 7s 13ms/step - loss: 0.5739 - accuracy: 0.8192 - val_loss: 0.5496 - val_accuracy: 0.8203 Epoch 12/30 566/566 [==============================] - 7s 13ms/step - loss: 0.5258 - accuracy: 0.8358 - val_loss: 0.4680 - val_accuracy: 0.8503 Epoch 13/30 566/566 [==============================] - 8s 13ms/step - loss: 0.5198 - accuracy: 0.8343 - val_loss: 0.3775 - val_accuracy: 0.8910 Epoch 14/30 566/566 [==============================] - 7s 13ms/step - loss: 0.4893 - accuracy: 0.8489 - val_loss: 0.6831 - 
val_accuracy: 0.7843 Epoch 15/30 566/566 [==============================] - 7s 13ms/step - loss: 0.4621 - accuracy: 0.8555 - val_loss: 0.4789 - val_accuracy: 0.8480 Epoch 16/30 566/566 [==============================] - 7s 13ms/step - loss: 0.4293 - accuracy: 0.8686 - val_loss: 0.4378 - val_accuracy: 0.8627 Epoch 17/30 566/566 [==============================] - 7s 13ms/step - loss: 0.4225 - accuracy: 0.8693 - val_loss: 0.6247 - val_accuracy: 0.8017 Epoch 18/30 566/566 [==============================] - 7s 13ms/step - loss: 0.3945 - accuracy: 0.8766 - val_loss: 0.4573 - val_accuracy: 0.8633 Epoch 19/30 566/566 [==============================] - 8s 14ms/step - loss: 0.3888 - accuracy: 0.8777 - val_loss: 0.2835 - val_accuracy: 0.9123 Epoch 20/30 566/566 [==============================] - 8s 13ms/step - loss: 0.3690 - accuracy: 0.8825 - val_loss: 0.3411 - val_accuracy: 0.8983 Epoch 21/30 566/566 [==============================] - 8s 13ms/step - loss: 0.3824 - accuracy: 0.8820 - val_loss: 0.3616 - val_accuracy: 0.8957 Epoch 22/30 566/566 [==============================] - 7s 13ms/step - loss: 0.3557 - accuracy: 0.8902 - val_loss: 0.4360 - val_accuracy: 0.8650 Epoch 23/30 566/566 [==============================] - 26s 47ms/step - loss: 0.3437 - accuracy: 0.8926 - val_loss: 0.4108 - val_accuracy: 0.8700 Epoch 24/30 566/566 [==============================] - 43s 77ms/step - loss: 0.3397 - accuracy: 0.8970 - val_loss: 0.2360 - val_accuracy: 0.9270 Epoch 25/30 566/566 [==============================] - 16s 28ms/step - loss: 0.3188 - accuracy: 0.8989 - val_loss: 0.3468 - val_accuracy: 0.8923 Epoch 26/30 566/566 [==============================] - 17s 30ms/step - loss: 0.3069 - accuracy: 0.9060 - val_loss: 0.2474 - val_accuracy: 0.9250 Epoch 27/30 566/566 [==============================] - 16s 28ms/step - loss: 0.3069 - accuracy: 0.9036 - val_loss: 0.3043 - val_accuracy: 0.9073 Epoch 28/30 566/566 [==============================] - 16s 28ms/step - loss: 0.2982 - accuracy: 
0.9074 - val_loss: 0.1937 - val_accuracy: 0.9420 Epoch 29/30 566/566 [==============================] - 18s 31ms/step - loss: 0.2864 - accuracy: 0.9123 - val_loss: 0.2498 - val_accuracy: 0.9287 Epoch 30/30 566/566 [==============================] - 17s 31ms/step - loss: 0.2784 - accuracy: 0.9115 - val_loss: 0.2834 - val_accuracy: 0.9117
Loss on Test Dataset: 0.2656 Accuracy on Test Dataset: 91.40% CNN Error on Test Dataset: 8.60%
- Next, I am going to make the model even more complex by adding back-to-back Conv2D layers.
I have taken inspiration from the architecture of the VGG16 model, and I have adjusted the architecture to suit my available resources and the dataset.
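The VGG-style pattern used below (two Conv2D layers per block, filters doubling each block, then pooling and dropout) can also be expressed programmatically rather than spelling out every layer by hand. A minimal, hypothetical sketch (the helper name and plan format are my own, not from the notebook):

```python
# Hypothetical sketch: generate the VGG-style layer plan
# (two conv layers per block, filters doubling each block,
# followed by pooling and dropout) as a list of specs.
def vgg_plan(blocks=3, base_filters=32, convs_per_block=2, drop=0.4):
    plan = []
    for b in range(blocks):
        filters = base_filters * (2 ** b)  # 32, 64, 128, ...
        plan += [("conv", filters)] * convs_per_block
        plan += [("pool", None), ("dropout", drop)]
    return plan

print(vgg_plan())
```

Each `("conv", n)` entry would map to a `Conv2D(n, (3, 3), activation='relu')` layer in the Sequential model; generating the plan this way makes it easy to vary the depth and filter counts when experimenting.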
complex_model_128_3 = Sequential()
# Convolutional Layer
complex_model_128_3.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Convolutional Layer
complex_model_128_3.add(Conv2D(32, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3.add(Conv2D(64, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3.add(Conv2D(128, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_3.add(Flatten())
# Fully connected layer
complex_model_128_3.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_3.add(BatchNormalization())
# Dropout layer
complex_model_128_3.add(Dropout(0.4))
# Output layer
complex_model_128_3.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_128_3
complex_model_128_3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the complex_model_128_3 summary
complex_model_128_3.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_5 (Conv2D) (None, 126, 126, 32) 320
conv2d_6 (Conv2D) (None, 124, 124, 32) 9248
max_pooling2d_5 (MaxPooling2D) (None, 62, 62, 32) 0
dropout_5 (Dropout) (None, 62, 62, 32) 0
conv2d_7 (Conv2D) (None, 60, 60, 64) 18496
conv2d_8 (Conv2D) (None, 58, 58, 64) 36928
max_pooling2d_6 (MaxPooling2D) (None, 29, 29, 64) 0
dropout_6 (Dropout) (None, 29, 29, 64) 0
conv2d_9 (Conv2D) (None, 27, 27, 128) 73856
conv2d_10 (Conv2D) (None, 25, 25, 128) 147584
max_pooling2d_7 (MaxPooling2D) (None, 12, 12, 128) 0
dropout_7 (Dropout) (None, 12, 12, 128) 0
flatten_2 (Flatten) (None, 18432) 0
dense_4 (Dense) (None, 128) 2359424
batch_normalization_1 (BatchNormalization) (None, 128) 512
dropout_8 (Dropout) (None, 128) 0
dense_5 (Dense) (None, 15) 1935
=================================================================
Total params: 2,648,303
Trainable params: 2,648,047
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_3 = complex_model_128_3.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_3.save_weights('complex_model_128_3_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_3)
# Final evaluation of model
scores = complex_model_128_3.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 35s 57ms/step - loss: 2.3749 - accuracy: 0.2401 - val_loss: 2.7497 - val_accuracy: 0.1683 Epoch 2/30 566/566 [==============================] - 31s 55ms/step - loss: 1.9504 - accuracy: 0.3744 - val_loss: 2.5912 - val_accuracy: 0.2793 Epoch 3/30 566/566 [==============================] - 30s 53ms/step - loss: 1.5189 - accuracy: 0.5158 - val_loss: 1.1851 - val_accuracy: 0.6123 Epoch 4/30 566/566 [==============================] - 13s 23ms/step - loss: 1.2033 - accuracy: 0.6171 - val_loss: 1.0343 - val_accuracy: 0.6600 Epoch 5/30 566/566 [==============================] - 12s 21ms/step - loss: 1.1116 - accuracy: 0.6375 - val_loss: 0.9513 - val_accuracy: 0.6923 Epoch 6/30 566/566 [==============================] - 12s 21ms/step - loss: 0.8952 - accuracy: 0.7142 - val_loss: 0.9308 - val_accuracy: 0.6930 Epoch 7/30 566/566 [==============================] - 12s 22ms/step - loss: 0.8056 - accuracy: 0.7458 - val_loss: 0.9341 - val_accuracy: 0.7027 Epoch 8/30 566/566 [==============================] - 12s 21ms/step - loss: 0.6900 - accuracy: 0.7822 - val_loss: 0.6289 - val_accuracy: 0.7903 Epoch 9/30 566/566 [==============================] - 14s 25ms/step - loss: 0.6142 - accuracy: 0.8048 - val_loss: 0.5164 - val_accuracy: 0.8323 Epoch 10/30 566/566 [==============================] - 12s 21ms/step - loss: 0.5583 - accuracy: 0.8239 - val_loss: 0.5578 - val_accuracy: 0.8590 Epoch 11/30 566/566 [==============================] - 12s 21ms/step - loss: 0.5249 - accuracy: 0.8348 - val_loss: 0.4202 - val_accuracy: 0.8600 Epoch 12/30 566/566 [==============================] - 12s 21ms/step - loss: 0.5203 - accuracy: 0.8365 - val_loss: 0.4055 - val_accuracy: 0.8723 Epoch 13/30 566/566 [==============================] - 13s 23ms/step - loss: 0.4385 - accuracy: 0.8631 - val_loss: 0.3787 - val_accuracy: 0.8757 Epoch 14/30 566/566 [==============================] - 12s 21ms/step - loss: 0.4060 - accuracy: 0.8715 - val_loss: 0.3249 - 
val_accuracy: 0.8947 Epoch 15/30 566/566 [==============================] - 12s 21ms/step - loss: 0.3915 - accuracy: 0.8758 - val_loss: 0.5264 - val_accuracy: 0.8310 Epoch 16/30 566/566 [==============================] - 12s 22ms/step - loss: 0.3691 - accuracy: 0.8830 - val_loss: 0.2513 - val_accuracy: 0.9123 Epoch 17/30 566/566 [==============================] - 13s 22ms/step - loss: 0.3158 - accuracy: 0.9005 - val_loss: 0.2535 - val_accuracy: 0.9180 Epoch 18/30 566/566 [==============================] - 12s 22ms/step - loss: 0.3272 - accuracy: 0.8968 - val_loss: 0.2549 - val_accuracy: 0.9143 Epoch 19/30 566/566 [==============================] - 12s 21ms/step - loss: 0.3036 - accuracy: 0.9031 - val_loss: 0.3393 - val_accuracy: 0.8853 Epoch 20/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2885 - accuracy: 0.9082 - val_loss: 0.3260 - val_accuracy: 0.8967 Epoch 21/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2719 - accuracy: 0.9129 - val_loss: 0.3263 - val_accuracy: 0.8917 Epoch 22/30 566/566 [==============================] - 12s 22ms/step - loss: 0.2784 - accuracy: 0.9130 - val_loss: 0.2367 - val_accuracy: 0.9287 Epoch 23/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2738 - accuracy: 0.9160 - val_loss: 0.3215 - val_accuracy: 0.9010 Epoch 24/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2438 - accuracy: 0.9237 - val_loss: 0.3657 - val_accuracy: 0.8823 Epoch 25/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2453 - accuracy: 0.9246 - val_loss: 0.2257 - val_accuracy: 0.9320 Epoch 26/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2137 - accuracy: 0.9329 - val_loss: 0.2612 - val_accuracy: 0.9207 Epoch 27/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2332 - accuracy: 0.9252 - val_loss: 0.2583 - val_accuracy: 0.9230 Epoch 28/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2236 - 
accuracy: 0.9286 - val_loss: 0.3117 - val_accuracy: 0.9097 Epoch 29/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2091 - accuracy: 0.9348 - val_loss: 0.2082 - val_accuracy: 0.9340 Epoch 30/30 566/566 [==============================] - 12s 21ms/step - loss: 0.2011 - accuracy: 0.9374 - val_loss: 0.1849 - val_accuracy: 0.9463
Loss on Test Dataset: 0.2089 Accuracy on Test Dataset: 94.37% CNN Error on Test Dataset: 5.63%
Insights:
- After adding back-to-back Conv2D layers, the model appears to be more stable.
- The test accuracy has also increased.
I will select this complex model as my best model for the 128 x 128 input size.
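To justify this selection, the three 128 x 128 candidates' test accuracies (copied from the evaluation outputs above) can be tabulated and compared. A small sketch; note that the first model's name is an assumption, since its definition falls outside this section:

```python
# Test accuracies (%) of the 128x128 candidate models, taken from the
# evaluation outputs above. The first key is an assumed model name.
results = {
    "complex_model_128_dropOut": 92.13,
    "complex_model_128_2": 91.40,
    "complex_model_128_3": 94.37,
}
best = max(results, key=results.get)
print(f"Best 128x128 model: {best} ({results[best]:.2f}% test accuracy)")
```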
B. Input size: 31 by 31 pixels
- Similarly, I will investigate whether increasing the dropout rate benefits the model.
complex_model_31_dropOut = Sequential()
# Convolutional Layer
complex_model_31_dropOut.add(Conv2D(32, (3,3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_dropOut.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_dropOut.add(Dropout(0.4))
# Convolutional Layer
complex_model_31_dropOut.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_31_dropOut.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature map
complex_model_31_dropOut.add(Flatten())
# Fully connected layer
complex_model_31_dropOut.add(Dense(128, activation='relu'))
complex_model_31_dropOut.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_31_dropOut
complex_model_31_dropOut.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# print the complex_model_31_dropOut summary
complex_model_31_dropOut.summary()
Model: "sequential_17"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_54 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_33 (MaxPooling2D) (None, 14, 14, 32) 0
dropout_34 (Dropout) (None, 14, 14, 32) 0
conv2d_55 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_34 (MaxPooling2D) (None, 6, 6, 64) 0
flatten_13 (Flatten) (None, 2304) 0
dense_26 (Dense) (None, 128) 295040
dense_27 (Dense) (None, 15) 1935
=================================================================
Total params: 315,791
Trainable params: 315,791
Non-trainable params: 0
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_dropOut = complex_model_31_dropOut.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_dropOut.save_weights('complex_model_31_dropOut_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_dropOut)
# Final evaluation of model
scores = complex_model_31_dropOut.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 4s 7ms/step - loss: 2.2391 - accuracy: 0.2743 - val_loss: 1.7968 - val_accuracy: 0.4267 Epoch 2/30 566/566 [==============================] - 4s 8ms/step - loss: 1.7393 - accuracy: 0.4361 - val_loss: 1.3075 - val_accuracy: 0.5803 Epoch 3/30 566/566 [==============================] - 4s 7ms/step - loss: 1.4565 - accuracy: 0.5305 - val_loss: 1.0431 - val_accuracy: 0.6810 Epoch 4/30 566/566 [==============================] - 4s 7ms/step - loss: 1.2941 - accuracy: 0.5858 - val_loss: 0.9232 - val_accuracy: 0.7083 Epoch 5/30 566/566 [==============================] - 4s 7ms/step - loss: 1.1802 - accuracy: 0.6228 - val_loss: 0.8389 - val_accuracy: 0.7410 Epoch 6/30 566/566 [==============================] - 4s 7ms/step - loss: 1.0929 - accuracy: 0.6470 - val_loss: 0.7679 - val_accuracy: 0.7633 Epoch 7/30 566/566 [==============================] - 4s 7ms/step - loss: 1.0130 - accuracy: 0.6740 - val_loss: 0.6973 - val_accuracy: 0.7797 Epoch 8/30 566/566 [==============================] - 4s 7ms/step - loss: 0.9452 - accuracy: 0.7015 - val_loss: 0.6245 - val_accuracy: 0.8027 Epoch 9/30 566/566 [==============================] - 4s 7ms/step - loss: 0.8889 - accuracy: 0.7169 - val_loss: 0.6102 - val_accuracy: 0.8063 Epoch 10/30 566/566 [==============================] - 5s 8ms/step - loss: 0.8275 - accuracy: 0.7352 - val_loss: 0.5724 - val_accuracy: 0.8210 Epoch 11/30 566/566 [==============================] - 4s 7ms/step - loss: 0.7926 - accuracy: 0.7494 - val_loss: 0.5469 - val_accuracy: 0.8297 Epoch 12/30 566/566 [==============================] - 4s 7ms/step - loss: 0.7577 - accuracy: 0.7616 - val_loss: 0.5030 - val_accuracy: 0.8420 Epoch 13/30 566/566 [==============================] - 4s 7ms/step - loss: 0.7358 - accuracy: 0.7675 - val_loss: 0.4844 - val_accuracy: 0.8490 Epoch 14/30 566/566 [==============================] - 4s 7ms/step - loss: 0.6839 - accuracy: 0.7831 - val_loss: 0.4894 - val_accuracy: 0.8463 Epoch 
15/30 566/566 [==============================] - 4s 7ms/step - loss: 0.6706 - accuracy: 0.7903 - val_loss: 0.4996 - val_accuracy: 0.8473 Epoch 16/30 566/566 [==============================] - 4s 7ms/step - loss: 0.6460 - accuracy: 0.7971 - val_loss: 0.4907 - val_accuracy: 0.8483 Epoch 17/30 566/566 [==============================] - 4s 7ms/step - loss: 0.6198 - accuracy: 0.8058 - val_loss: 0.4506 - val_accuracy: 0.8643 Epoch 18/30 566/566 [==============================] - 4s 7ms/step - loss: 0.5999 - accuracy: 0.8110 - val_loss: 0.4408 - val_accuracy: 0.8640 Epoch 19/30 566/566 [==============================] - 4s 7ms/step - loss: 0.5564 - accuracy: 0.8260 - val_loss: 0.4446 - val_accuracy: 0.8667 Epoch 20/30 566/566 [==============================] - 4s 7ms/step - loss: 0.5534 - accuracy: 0.8241 - val_loss: 0.4680 - val_accuracy: 0.8623 Epoch 21/30 566/566 [==============================] - 4s 7ms/step - loss: 0.5489 - accuracy: 0.8274 - val_loss: 0.4386 - val_accuracy: 0.8667 Epoch 22/30 566/566 [==============================] - 4s 8ms/step - loss: 0.5429 - accuracy: 0.8314 - val_loss: 0.5105 - val_accuracy: 0.8533 Epoch 23/30 566/566 [==============================] - 4s 8ms/step - loss: 0.5080 - accuracy: 0.8368 - val_loss: 0.4646 - val_accuracy: 0.8683 Epoch 24/30 566/566 [==============================] - 4s 8ms/step - loss: 0.4999 - accuracy: 0.8454 - val_loss: 0.4246 - val_accuracy: 0.8760 Epoch 25/30 566/566 [==============================] - 4s 7ms/step - loss: 0.5006 - accuracy: 0.8422 - val_loss: 0.4406 - val_accuracy: 0.8703 Epoch 26/30 566/566 [==============================] - 4s 7ms/step - loss: 0.4724 - accuracy: 0.8522 - val_loss: 0.4100 - val_accuracy: 0.8823 Epoch 27/30 566/566 [==============================] - 4s 8ms/step - loss: 0.4709 - accuracy: 0.8495 - val_loss: 0.4318 - val_accuracy: 0.8740 Epoch 28/30 566/566 [==============================] - 5s 8ms/step - loss: 0.4591 - accuracy: 0.8563 - val_loss: 0.4210 - val_accuracy: 0.8823 
Epoch 29/30 566/566 [==============================] - 4s 7ms/step - loss: 0.4469 - accuracy: 0.8573 - val_loss: 0.4336 - val_accuracy: 0.8757 Epoch 30/30 566/566 [==============================] - 4s 7ms/step - loss: 0.4425 - accuracy: 0.8609 - val_loss: 0.4205 - val_accuracy: 0.8807
Loss on Test Dataset: 0.3994 Accuracy on Test Dataset: 89.43% CNN Error on Test Dataset: 10.57%
Increasing the dropout rate from 0.2 to 0.4 slightly improves the performance of the model.
- Now, I am going to add one more convolutional layer with more filters, and add dropout layers after each convolutional and dense layer. I will also add one BatchNormalization layer.
complex_model_31_2 = Sequential()
# Convolutional Layer
complex_model_31_2.add(Conv2D(32, (3,3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_2.add(Dropout(0.2))
# Convolutional Layer
complex_model_31_2.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_31_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_2.add(Dropout(0.3))
# Convolutional Layer
complex_model_31_2.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_31_2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_2.add(Dropout(0.2))
# Flatten the feature map
complex_model_31_2.add(Flatten())
# Fully connected layer
complex_model_31_2.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_31_2.add(BatchNormalization())
# Dropout layer
complex_model_31_2.add(Dropout(0.2))
# Output layer
complex_model_31_2.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_31_2
complex_model_31_2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# print the complex_model_31_2 summary
complex_model_31_2.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_2 (MaxPooling2D) (None, 14, 14, 32) 0
dropout_3 (Dropout) (None, 14, 14, 32) 0
conv2d_3 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_3 (MaxPooling2D) (None, 6, 6, 64) 0
dropout_4 (Dropout) (None, 6, 6, 64) 0
conv2d_4 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_4 (MaxPooling2D) (None, 2, 2, 128) 0
dropout_5 (Dropout) (None, 2, 2, 128) 0
flatten_1 (Flatten) (None, 512) 0
dense_2 (Dense) (None, 128) 65664
batch_normalization_1 (BatchNormalization) (None, 128) 512
dropout_6 (Dropout) (None, 128) 0
dense_3 (Dense) (None, 15) 1935
=================================================================
Total params: 160,783
Trainable params: 160,527
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_2 = complex_model_31_2.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_2.save_weights('complex_model_31_2_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_2)
# Final evaluation of model
scores = complex_model_31_2.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 8s 12ms/step - loss: 2.1889 - accuracy: 0.2921 - val_loss: 1.8665 - val_accuracy: 0.4590 Epoch 2/30 566/566 [==============================] - 7s 12ms/step - loss: 1.7751 - accuracy: 0.4182 - val_loss: 1.3743 - val_accuracy: 0.5563 Epoch 3/30 566/566 [==============================] - 7s 12ms/step - loss: 1.5469 - accuracy: 0.4930 - val_loss: 1.1487 - val_accuracy: 0.6267 Epoch 4/30 566/566 [==============================] - 6s 11ms/step - loss: 1.3922 - accuracy: 0.5460 - val_loss: 0.9465 - val_accuracy: 0.6883 Epoch 5/30 566/566 [==============================] - 6s 11ms/step - loss: 1.3029 - accuracy: 0.5828 - val_loss: 0.7717 - val_accuracy: 0.7637 Epoch 6/30 566/566 [==============================] - 7s 12ms/step - loss: 1.2067 - accuracy: 0.6086 - val_loss: 0.7078 - val_accuracy: 0.7687 Epoch 7/30 566/566 [==============================] - 6s 11ms/step - loss: 1.1340 - accuracy: 0.6319 - val_loss: 0.6889 - val_accuracy: 0.7760 Epoch 8/30 566/566 [==============================] - 6s 10ms/step - loss: 1.0800 - accuracy: 0.6506 - val_loss: 0.5745 - val_accuracy: 0.8183 Epoch 9/30 566/566 [==============================] - 6s 11ms/step - loss: 1.0332 - accuracy: 0.6661 - val_loss: 0.5281 - val_accuracy: 0.8310 Epoch 10/30 566/566 [==============================] - 6s 10ms/step - loss: 0.9913 - accuracy: 0.6822 - val_loss: 0.5395 - val_accuracy: 0.8273 Epoch 11/30 566/566 [==============================] - 6s 11ms/step - loss: 0.9668 - accuracy: 0.6905 - val_loss: 0.4412 - val_accuracy: 0.8620 Epoch 12/30 566/566 [==============================] - 7s 11ms/step - loss: 0.9255 - accuracy: 0.6990 - val_loss: 0.4091 - val_accuracy: 0.8733 Epoch 13/30 566/566 [==============================] - 6s 11ms/step - loss: 0.9145 - accuracy: 0.7055 - val_loss: 0.4742 - val_accuracy: 0.8510 Epoch 14/30 566/566 [==============================] - 6s 11ms/step - loss: 0.8898 - accuracy: 0.7179 - val_loss: 0.3916 - val_accuracy: 
0.8807 Epoch 15/30 566/566 [==============================] - 7s 12ms/step - loss: 0.8709 - accuracy: 0.7213 - val_loss: 0.4116 - val_accuracy: 0.8650 Epoch 16/30 566/566 [==============================] - 6s 10ms/step - loss: 0.8316 - accuracy: 0.7313 - val_loss: 0.4238 - val_accuracy: 0.8643 Epoch 17/30 566/566 [==============================] - 7s 13ms/step - loss: 0.8346 - accuracy: 0.7312 - val_loss: 0.3367 - val_accuracy: 0.8977 Epoch 18/30 566/566 [==============================] - 6s 10ms/step - loss: 0.8124 - accuracy: 0.7402 - val_loss: 0.3624 - val_accuracy: 0.8860 Epoch 19/30 566/566 [==============================] - 6s 10ms/step - loss: 0.8035 - accuracy: 0.7424 - val_loss: 0.3233 - val_accuracy: 0.9040 Epoch 20/30 566/566 [==============================] - 6s 10ms/step - loss: 0.7860 - accuracy: 0.7493 - val_loss: 0.3222 - val_accuracy: 0.9030 Epoch 21/30 566/566 [==============================] - 6s 11ms/step - loss: 0.7619 - accuracy: 0.7549 - val_loss: 0.3551 - val_accuracy: 0.8850 Epoch 22/30 566/566 [==============================] - 6s 10ms/step - loss: 0.7776 - accuracy: 0.7534 - val_loss: 0.3058 - val_accuracy: 0.9020 Epoch 23/30 566/566 [==============================] - 5s 9ms/step - loss: 0.7629 - accuracy: 0.7544 - val_loss: 0.3133 - val_accuracy: 0.9047 Epoch 24/30 566/566 [==============================] - 6s 11ms/step - loss: 0.7341 - accuracy: 0.7650 - val_loss: 0.3187 - val_accuracy: 0.9017 Epoch 25/30 566/566 [==============================] - 6s 10ms/step - loss: 0.7207 - accuracy: 0.7682 - val_loss: 0.3004 - val_accuracy: 0.9037 Epoch 26/30 566/566 [==============================] - 6s 10ms/step - loss: 0.7095 - accuracy: 0.7718 - val_loss: 0.3318 - val_accuracy: 0.8983 Epoch 27/30 566/566 [==============================] - 5s 9ms/step - loss: 0.7018 - accuracy: 0.7764 - val_loss: 0.3141 - val_accuracy: 0.9043 Epoch 28/30 566/566 [==============================] - 6s 10ms/step - loss: 0.7077 - accuracy: 0.7750 - val_loss: 0.2939 - 
val_accuracy: 0.9160 Epoch 29/30 566/566 [==============================] - 6s 10ms/step - loss: 0.6761 - accuracy: 0.7839 - val_loss: 0.2727 - val_accuracy: 0.9223 Epoch 30/30 566/566 [==============================] - 6s 11ms/step - loss: 0.6978 - accuracy: 0.7770 - val_loss: 0.2710 - val_accuracy: 0.9233
Loss on Test Dataset: 0.2784 Accuracy on Test Dataset: 91.77% CNN Error on Test Dataset: 8.23%
The test accuracy has increased, but the model now overfits more strongly.
# Create model
complex_model_31_3 = Sequential()
# Input layer
complex_model_31_3.add(Conv2D(32, (3, 3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_3.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_3.add(Dropout(0.2))
# Convolutional layer
complex_model_31_3.add(Conv2D(64, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3.add(MaxPooling2D(pool_size=(2, 2)))
# Batch Normalization
complex_model_31_3.add(BatchNormalization())
# Dropout layer
complex_model_31_3.add(Dropout(0.3))
# Convolutional layer
complex_model_31_3.add(Conv2D(128, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout Layer
complex_model_31_3.add(Dropout(0.2))
# Flatten Feature Map
complex_model_31_3.add(Flatten())
# Fully Connected Dense Layer
complex_model_31_3.add(Dense(128, activation='relu'))
# Batch Normalization
complex_model_31_3.add(BatchNormalization())
# Dropout layer
complex_model_31_3.add(Dropout(0.2))
# Fully Connected Dense Layer
complex_model_31_3.add(Dense(64, activation='relu'))
# Output layer
complex_model_31_3.add(Dense(num_classes, activation='softmax'))
# Compile complex_model_31_3
complex_model_31_3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# complex_model_31_3 summary
complex_model_31_3.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 29, 29, 32) 320
max_pooling2d (MaxPooling2D) (None, 14, 14, 32) 0
dropout (Dropout) (None, 14, 14, 32) 0
conv2d_1 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_1 (MaxPooling2D) (None, 6, 6, 64) 0
batch_normalization (BatchNormalization) (None, 6, 6, 64) 256
dropout_1 (Dropout) (None, 6, 6, 64) 0
conv2d_2 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_2 (MaxPooling2D) (None, 2, 2, 128) 0
dropout_2 (Dropout) (None, 2, 2, 128) 0
flatten (Flatten) (None, 512) 0
dense (Dense) (None, 128) 65664
batch_normalization_1 (BatchNormalization) (None, 128) 512
dropout_3 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 64) 8256
dense_2 (Dense) (None, 15) 975
=================================================================
Total params: 168,335
Trainable params: 167,951
Non-trainable params: 384
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_3 = complex_model_31_3.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_3.save_weights('complex_model_31_3_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_3)
# Final evaluation of model
scores = complex_model_31_3.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 14s 13ms/step - loss: 2.1791 - accuracy: 0.2899 - val_loss: 2.9717 - val_accuracy: 0.2157 Epoch 2/30 566/566 [==============================] - 7s 13ms/step - loss: 1.6849 - accuracy: 0.4472 - val_loss: 1.3918 - val_accuracy: 0.5373 Epoch 3/30 566/566 [==============================] - 7s 13ms/step - loss: 1.4844 - accuracy: 0.5119 - val_loss: 1.2422 - val_accuracy: 0.5883 Epoch 4/30 566/566 [==============================] - 7s 13ms/step - loss: 1.3344 - accuracy: 0.5643 - val_loss: 0.9604 - val_accuracy: 0.6837 Epoch 5/30 566/566 [==============================] - 7s 13ms/step - loss: 1.2280 - accuracy: 0.5969 - val_loss: 0.8229 - val_accuracy: 0.7273 Epoch 6/30 566/566 [==============================] - 7s 13ms/step - loss: 1.1329 - accuracy: 0.6314 - val_loss: 0.7799 - val_accuracy: 0.7477 Epoch 7/30 566/566 [==============================] - 8s 13ms/step - loss: 1.0651 - accuracy: 0.6561 - val_loss: 0.5901 - val_accuracy: 0.8153 Epoch 8/30 566/566 [==============================] - 7s 13ms/step - loss: 0.9907 - accuracy: 0.6798 - val_loss: 0.7002 - val_accuracy: 0.7657 Epoch 9/30 566/566 [==============================] - 7s 13ms/step - loss: 0.9565 - accuracy: 0.6905 - val_loss: 0.5395 - val_accuracy: 0.8243 Epoch 10/30 566/566 [==============================] - 7s 13ms/step - loss: 0.8938 - accuracy: 0.7080 - val_loss: 0.4794 - val_accuracy: 0.8443 Epoch 11/30 566/566 [==============================] - 7s 13ms/step - loss: 0.8617 - accuracy: 0.7255 - val_loss: 0.5898 - val_accuracy: 0.8107 Epoch 12/30 566/566 [==============================] - 7s 12ms/step - loss: 0.8079 - accuracy: 0.7389 - val_loss: 0.4561 - val_accuracy: 0.8520 Epoch 13/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7955 - accuracy: 0.7437 - val_loss: 0.3898 - val_accuracy: 0.8803 Epoch 14/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7570 - accuracy: 0.7529 - val_loss: 0.4102 - 
val_accuracy: 0.8737 Epoch 15/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7503 - accuracy: 0.7600 - val_loss: 0.7941 - val_accuracy: 0.7477 Epoch 16/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7540 - accuracy: 0.7601 - val_loss: 0.3361 - val_accuracy: 0.8960 Epoch 17/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7140 - accuracy: 0.7707 - val_loss: 0.3076 - val_accuracy: 0.9043 Epoch 18/30 566/566 [==============================] - 7s 12ms/step - loss: 0.7034 - accuracy: 0.7709 - val_loss: 0.3306 - val_accuracy: 0.8950 Epoch 19/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6687 - accuracy: 0.7839 - val_loss: 0.3764 - val_accuracy: 0.8817 Epoch 20/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6642 - accuracy: 0.7849 - val_loss: 0.4239 - val_accuracy: 0.8660 Epoch 21/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6491 - accuracy: 0.7927 - val_loss: 0.6604 - val_accuracy: 0.7930 Epoch 22/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6391 - accuracy: 0.7912 - val_loss: 0.3794 - val_accuracy: 0.8800 Epoch 23/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6212 - accuracy: 0.8015 - val_loss: 0.3379 - val_accuracy: 0.8883 Epoch 24/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6129 - accuracy: 0.8014 - val_loss: 0.3410 - val_accuracy: 0.8930 Epoch 25/30 566/566 [==============================] - 7s 12ms/step - loss: 0.6032 - accuracy: 0.8044 - val_loss: 0.2664 - val_accuracy: 0.9190 Epoch 26/30 566/566 [==============================] - 7s 12ms/step - loss: 0.5936 - accuracy: 0.8084 - val_loss: 0.2570 - val_accuracy: 0.9213 Epoch 27/30 566/566 [==============================] - 7s 12ms/step - loss: 0.5706 - accuracy: 0.8155 - val_loss: 0.2691 - val_accuracy: 0.9193 Epoch 28/30 566/566 [==============================] - 7s 12ms/step - loss: 0.5699 - accuracy: 0.8155 - 
val_loss: 0.2420 - val_accuracy: 0.9290 Epoch 29/30 566/566 [==============================] - 7s 12ms/step - loss: 0.5485 - accuracy: 0.8238 - val_loss: 0.2248 - val_accuracy: 0.9277 Epoch 30/30 566/566 [==============================] - 7s 12ms/step - loss: 0.5603 - accuracy: 0.8164 - val_loss: 0.2456 - val_accuracy: 0.9213
Loss on Test Dataset: 0.2409 Accuracy on Test Dataset: 92.10% CNN Error on Test Dataset: 7.90%
Insights:
While modelling for the 31 x 31 input size, I realised that the model tends to overfit more than with the 128 x 128 input size.
5. Model Improvement
1. Layer Kernel Regularizers
Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are summed into the loss function that the network optimizes.
I will apply a kernel_regularizer to penalize very large weights, which cause the network to overfit.
The available regularizers are the L1 and L2 classes; they differ in how the penalty is computed:
L1 regularization penalty: loss = l1 * reduce_sum(abs(x))
L2 regularization penalty: loss = l2 * reduce_sum(square(x))
Source:
https://keras.io/api/layers/regularizers/
https://www.tensorflow.org/api_docs/python/tf/keras/regularizers
https://www.analyticsvidhya.com/blog/2018/04/fundamentals-deep-learning-regularization-techniques/
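The two penalty formulas above can be checked directly. This is a minimal sketch (not part of the notebook's pipeline; the weight values are arbitrary) showing that `tf.keras.regularizers.l1`/`l2` compute exactly `l1 * sum(|w|)` and `l2 * sum(w^2)`, and how a regularizer attaches to a layer so its penalty is added to the training loss:

```python
import numpy as np
import tensorflow as tf

# Arbitrary example weights to evaluate the penalties on
weights = tf.constant([[0.5, -1.5], [2.0, -0.5]])

l1_reg = tf.keras.regularizers.l1(0.01)  # loss = 0.01 * reduce_sum(abs(w))
l2_reg = tf.keras.regularizers.l2(0.01)  # loss = 0.01 * reduce_sum(square(w))

l1_penalty = l1_reg(weights).numpy()  # 0.01 * 4.5  = 0.045
l2_penalty = l2_reg(weights).numpy()  # 0.01 * 6.75 = 0.0675

# Both match a manual computation of the formulas
assert np.isclose(l1_penalty, 0.01 * np.sum(np.abs(weights.numpy())))
assert np.isclose(l2_penalty, 0.01 * np.sum(np.square(weights.numpy())))

# Attached to a layer, the penalty on that layer's kernel is added
# to the model's loss at every training step
output_layer = tf.keras.layers.Dense(15, activation='softmax',
                                     kernel_regularizer=l2_reg)
```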
A. Input size: 128 by 128 pixels
1. L1 regularization
complex_model_128_3_l1 = Sequential()
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(32, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l1.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(64, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l1.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(128, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l1.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l1.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_3_l1.add(Flatten())
# Fully connected layer
complex_model_128_3_l1.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_3_l1.add(BatchNormalization())
# Dropout layer
complex_model_128_3_l1.add(Dropout(0.4))
# Output layer
complex_model_128_3_l1.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
# Compile complex_model_128_3_l1
complex_model_128_3_l1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the complex_model_128_3_l1 summary
complex_model_128_3_l1.summary()
Model: "sequential_30"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_92 (Conv2D) (None, 126, 126, 32) 320
conv2d_93 (Conv2D) (None, 124, 124, 32) 9248
max_pooling2d_65 (MaxPooling2D) (None, 62, 62, 32) 0
dropout_69 (Dropout) (None, 62, 62, 32) 0
conv2d_94 (Conv2D) (None, 60, 60, 64) 18496
conv2d_95 (Conv2D) (None, 58, 58, 64) 36928
max_pooling2d_66 (MaxPooling2D) (None, 29, 29, 64) 0
dropout_70 (Dropout) (None, 29, 29, 64) 0
conv2d_96 (Conv2D) (None, 27, 27, 128) 73856
conv2d_97 (Conv2D) (None, 25, 25, 128) 147584
max_pooling2d_67 (MaxPooling2D) (None, 12, 12, 128) 0
dropout_71 (Dropout) (None, 12, 12, 128) 0
flatten_24 (Flatten) (None, 18432) 0
dense_55 (Dense) (None, 128) 2359424
batch_normalization_24 (BatchNormalization) (None, 128) 512
dropout_72 (Dropout) (None, 128) 0
dense_56 (Dense) (None, 15) 1935
=================================================================
Total params: 2,648,303
Trainable params: 2,648,047
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_3_l1 = complex_model_128_3_l1.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_3_l1.save_weights('complex_model_128_3_l1_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_3_l1)
# Final evaluation of model
scores = complex_model_128_3_l1.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 38s 67ms/step - loss: 2.9554 - accuracy: 0.2234 - val_loss: 2.4250 - val_accuracy: 0.2603 Epoch 2/30 566/566 [==============================] - 38s 66ms/step - loss: 2.1265 - accuracy: 0.3759 - val_loss: 2.2518 - val_accuracy: 0.3170 Epoch 3/30 566/566 [==============================] - 37s 65ms/step - loss: 1.8380 - accuracy: 0.4925 - val_loss: 1.8126 - val_accuracy: 0.5160 Epoch 4/30 566/566 [==============================] - 38s 66ms/step - loss: 1.6792 - accuracy: 0.5488 - val_loss: 1.4784 - val_accuracy: 0.6370 Epoch 5/30 566/566 [==============================] - 38s 67ms/step - loss: 1.4476 - accuracy: 0.6268 - val_loss: 1.8600 - val_accuracy: 0.4607 Epoch 6/30 566/566 [==============================] - 38s 67ms/step - loss: 1.3080 - accuracy: 0.6751 - val_loss: 1.0876 - val_accuracy: 0.7653 Epoch 7/30 566/566 [==============================] - 37s 65ms/step - loss: 1.1425 - accuracy: 0.7295 - val_loss: 0.8673 - val_accuracy: 0.8287 Epoch 8/30 566/566 [==============================] - 38s 67ms/step - loss: 1.0336 - accuracy: 0.7689 - val_loss: 0.7760 - val_accuracy: 0.8597 Epoch 9/30 566/566 [==============================] - 34s 60ms/step - loss: 1.5829 - accuracy: 0.5729 - val_loss: 1.1950 - val_accuracy: 0.7177 Epoch 10/30 566/566 [==============================] - 35s 62ms/step - loss: 1.0385 - accuracy: 0.7585 - val_loss: 0.8322 - val_accuracy: 0.8370 Epoch 11/30 566/566 [==============================] - 37s 65ms/step - loss: 0.8643 - accuracy: 0.8173 - val_loss: 0.6489 - val_accuracy: 0.8853 Epoch 12/30 566/566 [==============================] - 38s 66ms/step - loss: 0.8272 - accuracy: 0.8258 - val_loss: 0.6553 - val_accuracy: 0.8793 Epoch 13/30 566/566 [==============================] - 38s 66ms/step - loss: 0.9418 - accuracy: 0.7880 - val_loss: 0.7749 - val_accuracy: 0.8477 Epoch 14/30 566/566 [==============================] - 38s 67ms/step - loss: 0.7785 - accuracy: 0.8391 - val_loss: 0.6746 - 
val_accuracy: 0.8673 Epoch 15/30 566/566 [==============================] - 37s 65ms/step - loss: 0.7262 - accuracy: 0.8536 - val_loss: 0.5512 - val_accuracy: 0.9077 Epoch 16/30 566/566 [==============================] - 38s 66ms/step - loss: 0.6796 - accuracy: 0.8726 - val_loss: 0.4571 - val_accuracy: 0.9367 Epoch 17/30 566/566 [==============================] - 38s 66ms/step - loss: 0.6516 - accuracy: 0.8747 - val_loss: 0.4726 - val_accuracy: 0.9257 Epoch 18/30 566/566 [==============================] - 38s 67ms/step - loss: 0.6113 - accuracy: 0.8901 - val_loss: 0.4425 - val_accuracy: 0.9433 Epoch 19/30 566/566 [==============================] - 38s 66ms/step - loss: 0.6044 - accuracy: 0.8878 - val_loss: 0.4209 - val_accuracy: 0.9433 Epoch 20/30 566/566 [==============================] - 37s 65ms/step - loss: 0.5853 - accuracy: 0.8953 - val_loss: 0.5284 - val_accuracy: 0.9050 Epoch 21/30 566/566 [==============================] - 38s 66ms/step - loss: 0.5491 - accuracy: 0.9039 - val_loss: 0.4142 - val_accuracy: 0.9423 Epoch 22/30 566/566 [==============================] - 38s 66ms/step - loss: 0.9208 - accuracy: 0.7875 - val_loss: 0.6737 - val_accuracy: 0.8667 Epoch 23/30 566/566 [==============================] - 38s 67ms/step - loss: 0.5647 - accuracy: 0.8968 - val_loss: 0.3485 - val_accuracy: 0.9610 Epoch 24/30 566/566 [==============================] - 37s 64ms/step - loss: 0.4974 - accuracy: 0.9215 - val_loss: 0.3255 - val_accuracy: 0.9650 Epoch 25/30 566/566 [==============================] - 38s 66ms/step - loss: 0.6051 - accuracy: 0.8859 - val_loss: 0.3765 - val_accuracy: 0.9490 Epoch 26/30 566/566 [==============================] - 38s 66ms/step - loss: 0.4625 - accuracy: 0.9294 - val_loss: 0.3246 - val_accuracy: 0.9680 Epoch 27/30 566/566 [==============================] - 38s 66ms/step - loss: 0.4522 - accuracy: 0.9299 - val_loss: 0.3002 - val_accuracy: 0.9737 Epoch 28/30 566/566 [==============================] - 34s 60ms/step - loss: 0.4466 - 
accuracy: 0.9327 - val_loss: 0.3350 - val_accuracy: 0.9593 Epoch 29/30 566/566 [==============================] - 34s 59ms/step - loss: 0.4416 - accuracy: 0.9316 - val_loss: 0.3093 - val_accuracy: 0.9643 Epoch 30/30 566/566 [==============================] - 34s 59ms/step - loss: 0.4226 - accuracy: 0.9382 - val_loss: 0.3183 - val_accuracy: 0.9663
Loss on Test Dataset: 0.3160 Accuracy on Test Dataset: 96.67% CNN Error on Test Dataset: 3.33%
Insights:
- With the L1 regularizer added, the model overfits less.
2. L2 regularization
complex_model_128_3_l2 = Sequential()
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(32, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(64, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(128, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_3_l2.add(Flatten())
# Fully connected layer
complex_model_128_3_l2.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_3_l2.add(BatchNormalization())
# Dropout layer
complex_model_128_3_l2.add(Dropout(0.4))
# Output layer
complex_model_128_3_l2.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
# Compile complex_model_128_3_l2
complex_model_128_3_l2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Print the complex_model_128_3_l2 summary
complex_model_128_3_l2.summary()
Model: "sequential_31"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_98 (Conv2D) (None, 126, 126, 32) 320
conv2d_99 (Conv2D) (None, 124, 124, 32) 9248
max_pooling2d_68 (MaxPooling2D) (None, 62, 62, 32) 0
dropout_73 (Dropout) (None, 62, 62, 32) 0
conv2d_100 (Conv2D) (None, 60, 60, 64) 18496
conv2d_101 (Conv2D) (None, 58, 58, 64) 36928
max_pooling2d_69 (MaxPooling2D) (None, 29, 29, 64) 0
dropout_74 (Dropout) (None, 29, 29, 64) 0
conv2d_102 (Conv2D) (None, 27, 27, 128) 73856
conv2d_103 (Conv2D) (None, 25, 25, 128) 147584
max_pooling2d_70 (MaxPooling2D) (None, 12, 12, 128) 0
dropout_75 (Dropout) (None, 12, 12, 128) 0
flatten_25 (Flatten) (None, 18432) 0
dense_57 (Dense) (None, 128) 2359424
batch_normalization_25 (BatchNormalization) (None, 128) 512
dropout_76 (Dropout) (None, 128) 0
dense_58 (Dense) (None, 15) 1935
=================================================================
Total params: 2,648,303
Trainable params: 2,648,047
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_3_l2 = complex_model_128_3_l2.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_3_l2.save_weights('complex_model_128_3_l2_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_3_l2)
# Final evaluation of model
scores = complex_model_128_3_l2.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 34s 59ms/step - loss: 2.4532 - accuracy: 0.2341 - val_loss: 2.1542 - val_accuracy: 0.2917 Epoch 2/30 566/566 [==============================] - 34s 59ms/step - loss: 1.8221 - accuracy: 0.4248 - val_loss: 2.5700 - val_accuracy: 0.2130 Epoch 3/30 566/566 [==============================] - 34s 59ms/step - loss: 1.5534 - accuracy: 0.5161 - val_loss: 1.6362 - val_accuracy: 0.4533 Epoch 4/30 566/566 [==============================] - 32s 56ms/step - loss: 1.2836 - accuracy: 0.6120 - val_loss: 0.9652 - val_accuracy: 0.7353 Epoch 5/30 566/566 [==============================] - 34s 59ms/step - loss: 1.1120 - accuracy: 0.6668 - val_loss: 1.0940 - val_accuracy: 0.6570 Epoch 6/30 566/566 [==============================] - 34s 59ms/step - loss: 0.9866 - accuracy: 0.7127 - val_loss: 0.8626 - val_accuracy: 0.7433 Epoch 7/30 566/566 [==============================] - 34s 59ms/step - loss: 0.9248 - accuracy: 0.7328 - val_loss: 0.6962 - val_accuracy: 0.7970 Epoch 8/30 566/566 [==============================] - 32s 57ms/step - loss: 0.8042 - accuracy: 0.7705 - val_loss: 0.5729 - val_accuracy: 0.8510 Epoch 9/30 566/566 [==============================] - 34s 59ms/step - loss: 0.7220 - accuracy: 0.7993 - val_loss: 0.5561 - val_accuracy: 0.8570 Epoch 10/30 566/566 [==============================] - 34s 60ms/step - loss: 0.6450 - accuracy: 0.8237 - val_loss: 0.4689 - val_accuracy: 0.8810 Epoch 11/30 566/566 [==============================] - 34s 59ms/step - loss: 0.6031 - accuracy: 0.8377 - val_loss: 0.4873 - val_accuracy: 0.8717 Epoch 12/30 566/566 [==============================] - 33s 58ms/step - loss: 0.5512 - accuracy: 0.8548 - val_loss: 0.3776 - val_accuracy: 0.9097 Epoch 13/30 566/566 [==============================] - 34s 59ms/step - loss: 0.4926 - accuracy: 0.8694 - val_loss: 0.3598 - val_accuracy: 0.9160 Epoch 14/30 566/566 [==============================] - 34s 60ms/step - loss: 0.4923 - accuracy: 0.8706 - val_loss: 0.4936 - 
val_accuracy: 0.8653 Epoch 15/30 566/566 [==============================] - 34s 60ms/step - loss: 0.5078 - accuracy: 0.8676 - val_loss: 0.2714 - val_accuracy: 0.9393 Epoch 16/30 566/566 [==============================] - 34s 59ms/step - loss: 0.4003 - accuracy: 0.8993 - val_loss: 0.3029 - val_accuracy: 0.9240 Epoch 17/30 566/566 [==============================] - 33s 58ms/step - loss: 0.3856 - accuracy: 0.9031 - val_loss: 0.2563 - val_accuracy: 0.9410 Epoch 18/30 566/566 [==============================] - 34s 60ms/step - loss: 0.4022 - accuracy: 0.8975 - val_loss: 0.2805 - val_accuracy: 0.9343 Epoch 19/30 566/566 [==============================] - 34s 60ms/step - loss: 0.3247 - accuracy: 0.9209 - val_loss: 0.2467 - val_accuracy: 0.9483 Epoch 20/30 566/566 [==============================] - 34s 60ms/step - loss: 0.3053 - accuracy: 0.9289 - val_loss: 0.2068 - val_accuracy: 0.9593 Epoch 21/30 566/566 [==============================] - 33s 57ms/step - loss: 0.2977 - accuracy: 0.9287 - val_loss: 0.2053 - val_accuracy: 0.9613 Epoch 22/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2905 - accuracy: 0.9302 - val_loss: 0.2208 - val_accuracy: 0.9510 Epoch 23/30 566/566 [==============================] - 34s 60ms/step - loss: 0.2651 - accuracy: 0.9361 - val_loss: 0.1931 - val_accuracy: 0.9623 Epoch 24/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2628 - accuracy: 0.9386 - val_loss: 0.2246 - val_accuracy: 0.9510 Epoch 25/30 566/566 [==============================] - 33s 57ms/step - loss: 0.2488 - accuracy: 0.9420 - val_loss: 0.2104 - val_accuracy: 0.9567 Epoch 26/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2447 - accuracy: 0.9440 - val_loss: 0.1959 - val_accuracy: 0.9613 Epoch 27/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2542 - accuracy: 0.9374 - val_loss: 0.2170 - val_accuracy: 0.9547 Epoch 28/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2401 - 
accuracy: 0.9443 - val_loss: 0.1911 - val_accuracy: 0.9600 Epoch 29/30 566/566 [==============================] - 32s 57ms/step - loss: 0.2206 - accuracy: 0.9471 - val_loss: 0.2735 - val_accuracy: 0.9313 Epoch 30/30 566/566 [==============================] - 34s 59ms/step - loss: 0.2104 - accuracy: 0.9514 - val_loss: 0.1582 - val_accuracy: 0.9730
Loss on Test Dataset: 0.1422 Accuracy on Test Dataset: 97.30% CNN Error on Test Dataset: 2.70%
Insights:
- The model overfits less and the test accuracy has increased.
Hence, I will choose the L2 regularizer.
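A gradient view helps explain the difference behind this choice. This is a minimal illustrative sketch (not part of the notebook's pipeline; the weight values are arbitrary): the L1 penalty's gradient is `l1 * sign(w)`, the same push regardless of magnitude, so small weights get driven to exactly zero (sparse kernels), while the L2 penalty's gradient is `2 * l2 * w`, so large weights are shrunk hard but small ones are never zeroed:

```python
import tensorflow as tf

# One tiny weight and one large weight, to compare the penalty gradients
w = tf.Variable([[0.001, 1.0]])

grads = {}
for name, reg in [("L1", tf.keras.regularizers.l1(0.01)),
                  ("L2", tf.keras.regularizers.l2(0.01))]:
    with tf.GradientTape() as tape:
        penalty = reg(w)
    grads[name] = tape.gradient(penalty, w).numpy()

# L1 gradient is l1 * sign(w): equal force [[0.01, 0.01]] on both weights
print(grads["L1"])
# L2 gradient is 2 * l2 * w: proportional force [[0.00002, 0.02]]
print(grads["L2"])
```

Because L2 shrinks weights smoothly instead of zeroing them, it tends to regularize without starving the layer of capacity, which is consistent with the higher test accuracy observed above.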
B. Input size: 31 by 31 pixels
1. L1 regularization
# Create model
complex_model_31_3_l1 = Sequential()
# Input layer
complex_model_31_3_l1.add(Conv2D(32, (3, 3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_3_l1.add(Dropout(0.2))
# Convolutional layer
complex_model_31_3_l1.add(Conv2D(64, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Batch Normalization
complex_model_31_3_l1.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1.add(Dropout(0.3))
# Convolutional layer
complex_model_31_3_l1.add(Conv2D(128, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout Layer
complex_model_31_3_l1.add(Dropout(0.2))
# Flatten Feature Map
complex_model_31_3_l1.add(Flatten())
# Fully Connected Dense Layer
complex_model_31_3_l1.add(Dense(128, activation='relu'))
# Batch Normalization
complex_model_31_3_l1.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1.add(Dropout(0.2))
# Fully Connected Dense Layer
complex_model_31_3_l1.add(Dense(64, activation='relu'))
# Output layer
complex_model_31_3_l1.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
# Compile complex_model_31_3_l1
complex_model_31_3_l1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# complex_model_31_3_l1 summary
complex_model_31_3_l1.summary()
Model: "sequential_33"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_107 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_74 (MaxPooling2D) (None, 14, 14, 32) 0
dropout_81 (Dropout) (None, 14, 14, 32) 0
conv2d_108 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_75 (MaxPooling2D) (None, 6, 6, 64) 0
batch_normalization_28 (BatchNormalization) (None, 6, 6, 64) 256
dropout_82 (Dropout) (None, 6, 6, 64) 0
conv2d_109 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_76 (MaxPooling2D) (None, 2, 2, 128) 0
dropout_83 (Dropout) (None, 2, 2, 128) 0
flatten_27 (Flatten) (None, 512) 0
dense_62 (Dense) (None, 128) 65664
batch_normalization_29 (BatchNormalization) (None, 128) 512
dropout_84 (Dropout) (None, 128) 0
dense_63 (Dense) (None, 64) 8256
dense_64 (Dense) (None, 15) 975
=================================================================
Total params: 168,335
Trainable params: 167,951
Non-trainable params: 384
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_3_l1 = complex_model_31_3_l1.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_3_l1.save_weights('complex_model_31_3_l1_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_3_l1)
# Final evaluation of model
scores = complex_model_31_3_l1.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 - 10s 16ms/step - loss: 2.9332 - accuracy: 0.2918 - val_loss: 3.3141 - val_accuracy: 0.1567
Epoch 2/30 - 9s 16ms/step - loss: 2.1039 - accuracy: 0.3980 - val_loss: 2.2038 - val_accuracy: 0.3590
Epoch 3/30 - 9s 16ms/step - loss: 1.8451 - accuracy: 0.4581 - val_loss: 1.9010 - val_accuracy: 0.4300
Epoch 4/30 - 9s 16ms/step - loss: 1.7012 - accuracy: 0.5045 - val_loss: 1.5159 - val_accuracy: 0.5573
Epoch 5/30 - 9s 15ms/step - loss: 1.5915 - accuracy: 0.5349 - val_loss: 1.2748 - val_accuracy: 0.6287
Epoch 6/30 - 9s 16ms/step - loss: 1.4799 - accuracy: 0.5689 - val_loss: 1.2841 - val_accuracy: 0.6210
Epoch 7/30 - 9s 16ms/step - loss: 1.3932 - accuracy: 0.5908 - val_loss: 1.6099 - val_accuracy: 0.5303
Epoch 8/30 - 9s 16ms/step - loss: 1.3380 - accuracy: 0.6162 - val_loss: 1.3202 - val_accuracy: 0.6127
Epoch 9/30 - 9s 16ms/step - loss: 1.2521 - accuracy: 0.6374 - val_loss: 0.8943 - val_accuracy: 0.7540
Epoch 10/30 - 9s 16ms/step - loss: 1.1986 - accuracy: 0.6609 - val_loss: 0.7916 - val_accuracy: 0.7900
Epoch 11/30 - 9s 16ms/step - loss: 1.1588 - accuracy: 0.6715 - val_loss: 0.7670 - val_accuracy: 0.7947
Epoch 12/30 - 9s 16ms/step - loss: 1.1125 - accuracy: 0.6822 - val_loss: 0.9724 - val_accuracy: 0.7187
Epoch 13/30 - 9s 16ms/step - loss: 1.0683 - accuracy: 0.6953 - val_loss: 0.7139 - val_accuracy: 0.8097
Epoch 14/30 - 9s 16ms/step - loss: 1.0281 - accuracy: 0.7029 - val_loss: 0.5765 - val_accuracy: 0.8583
Epoch 15/30 - 9s 15ms/step - loss: 1.0113 - accuracy: 0.7131 - val_loss: 0.7912 - val_accuracy: 0.7733
Epoch 16/30 - 9s 16ms/step - loss: 0.9772 - accuracy: 0.7230 - val_loss: 0.6989 - val_accuracy: 0.8087
Epoch 17/30 - 9s 16ms/step - loss: 0.9715 - accuracy: 0.7254 - val_loss: 0.8963 - val_accuracy: 0.7460
Epoch 18/30 - 9s 16ms/step - loss: 0.9444 - accuracy: 0.7334 - val_loss: 0.5721 - val_accuracy: 0.8523
Epoch 19/30 - 9s 16ms/step - loss: 0.9161 - accuracy: 0.7426 - val_loss: 0.5594 - val_accuracy: 0.8580
Epoch 20/30 - 9s 16ms/step - loss: 0.9003 - accuracy: 0.7442 - val_loss: 0.6018 - val_accuracy: 0.8363
Epoch 21/30 - 9s 16ms/step - loss: 0.8853 - accuracy: 0.7510 - val_loss: 0.4626 - val_accuracy: 0.8940
Epoch 22/30 - 9s 16ms/step - loss: 0.8658 - accuracy: 0.7516 - val_loss: 0.5277 - val_accuracy: 0.8600
Epoch 23/30 - 9s 16ms/step - loss: 0.8571 - accuracy: 0.7569 - val_loss: 0.5411 - val_accuracy: 0.8610
Epoch 24/30 - 9s 16ms/step - loss: 0.8584 - accuracy: 0.7579 - val_loss: 0.4522 - val_accuracy: 0.8907
Epoch 25/30 - 9s 15ms/step - loss: 0.8372 - accuracy: 0.7617 - val_loss: 0.4074 - val_accuracy: 0.9027
Epoch 26/30 - 9s 17ms/step - loss: 0.8237 - accuracy: 0.7672 - val_loss: 0.5089 - val_accuracy: 0.8650
Epoch 27/30 - 10s 17ms/step - loss: 0.8476 - accuracy: 0.7599 - val_loss: 0.4051 - val_accuracy: 0.9037
Epoch 28/30 - 9s 16ms/step - loss: 0.8131 - accuracy: 0.7672 - val_loss: 0.3977 - val_accuracy: 0.9020
Epoch 29/30 - 9s 16ms/step - loss: 0.7898 - accuracy: 0.7778 - val_loss: 0.4371 - val_accuracy: 0.8927
Epoch 30/30 - 9s 16ms/step - loss: 0.7906 - accuracy: 0.7753 - val_loss: 0.3773 - val_accuracy: 0.9143
Loss on Test Dataset: 0.3674
Accuracy on Test Dataset: 92.47%
CNN Error on Test Dataset: 7.53%
2. L2 regularization
# Create model
complex_model_31_3_l2 = Sequential()
# Input layer
complex_model_31_3_l2.add(Conv2D(32, (3, 3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_3_l2.add(Dropout(0.2))
# Convolutional layer
complex_model_31_3_l2.add(Conv2D(64, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Batch Normalization
complex_model_31_3_l2.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l2.add(Dropout(0.3))
# Convolutional layer
complex_model_31_3_l2.add(Conv2D(128, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l2.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout Layer
complex_model_31_3_l2.add(Dropout(0.2))
# Flatten Feature Map
complex_model_31_3_l2.add(Flatten())
# Fully Connected Dense Layer
complex_model_31_3_l2.add(Dense(128, activation='relu'))
# Batch Normalization
complex_model_31_3_l2.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l2.add(Dropout(0.2))
# Fully Connected Dense Layer
complex_model_31_3_l2.add(Dense(64, activation='relu'))
# Output layer
complex_model_31_3_l2.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
# Compile complex_model_31_3_l2
complex_model_31_3_l2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# complex_model_31_3_l2 summary
complex_model_31_3_l2.summary()
Model: "sequential_32"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_104 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_71 (MaxPoolin (None, 14, 14, 32) 0
g2D)
dropout_77 (Dropout) (None, 14, 14, 32) 0
conv2d_105 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_72 (MaxPoolin (None, 6, 6, 64) 0
g2D)
batch_normalization_26 (Bat (None, 6, 6, 64) 256
chNormalization)
dropout_78 (Dropout) (None, 6, 6, 64) 0
conv2d_106 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_73 (MaxPoolin (None, 2, 2, 128) 0
g2D)
dropout_79 (Dropout) (None, 2, 2, 128) 0
flatten_26 (Flatten) (None, 512) 0
dense_59 (Dense) (None, 128) 65664
batch_normalization_27 (Bat (None, 128) 512
chNormalization)
dropout_80 (Dropout) (None, 128) 0
dense_60 (Dense) (None, 64) 8256
dense_61 (Dense) (None, 15) 975
=================================================================
Total params: 168,335
Trainable params: 167,951
Non-trainable params: 384
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_3_l2 = complex_model_31_3_l2.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_3_l2.save_weights('complex_model_31_3_l2_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_3_l2)
# Final evaluation of model
scores = complex_model_31_3_l2.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))  # loss is not a percentage
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 - 10s 16ms/step - loss: 2.3583 - accuracy: 0.2846 - val_loss: 2.9424 - val_accuracy: 0.2113
Epoch 2/30 - 9s 16ms/step - loss: 1.8857 - accuracy: 0.4104 - val_loss: 1.5416 - val_accuracy: 0.5133
Epoch 3/30 - 9s 16ms/step - loss: 1.6694 - accuracy: 0.4746 - val_loss: 1.4822 - val_accuracy: 0.5220
Epoch 4/30 - 9s 16ms/step - loss: 1.5414 - accuracy: 0.5172 - val_loss: 2.1690 - val_accuracy: 0.3850
Epoch 5/30 - 9s 16ms/step - loss: 1.4263 - accuracy: 0.5525 - val_loss: 1.0845 - val_accuracy: 0.6750
Epoch 6/30 - 9s 16ms/step - loss: 1.3336 - accuracy: 0.5869 - val_loss: 1.2165 - val_accuracy: 0.6247
Epoch 7/30 - 9s 16ms/step - loss: 1.2657 - accuracy: 0.6061 - val_loss: 0.9439 - val_accuracy: 0.7107
Epoch 8/30 - 9s 16ms/step - loss: 1.2057 - accuracy: 0.6278 - val_loss: 0.7659 - val_accuracy: 0.7733
Epoch 9/30 - 9s 16ms/step - loss: 1.1359 - accuracy: 0.6465 - val_loss: 0.7663 - val_accuracy: 0.7603
Epoch 10/30 - 9s 16ms/step - loss: 1.0974 - accuracy: 0.6593 - val_loss: 0.7561 - val_accuracy: 0.7730
Epoch 11/30 - 9s 16ms/step - loss: 1.0579 - accuracy: 0.6749 - val_loss: 0.7101 - val_accuracy: 0.7870
Epoch 12/30 - 9s 16ms/step - loss: 1.0186 - accuracy: 0.6869 - val_loss: 0.6386 - val_accuracy: 0.8097
Epoch 13/30 - 9s 15ms/step - loss: 0.9854 - accuracy: 0.6959 - val_loss: 1.8578 - val_accuracy: 0.5043
Epoch 14/30 - 9s 16ms/step - loss: 1.0521 - accuracy: 0.6796 - val_loss: 0.6314 - val_accuracy: 0.8093
Epoch 15/30 - 9s 16ms/step - loss: 0.9400 - accuracy: 0.7084 - val_loss: 0.7179 - val_accuracy: 0.7897
Epoch 16/30 - 9s 16ms/step - loss: 0.9024 - accuracy: 0.7200 - val_loss: 0.5539 - val_accuracy: 0.8387
Epoch 17/30 - 9s 16ms/step - loss: 0.8839 - accuracy: 0.7261 - val_loss: 0.6702 - val_accuracy: 0.8043
Epoch 18/30 - 9s 16ms/step - loss: 0.8645 - accuracy: 0.7321 - val_loss: 0.5386 - val_accuracy: 0.8417
Epoch 19/30 - 9s 16ms/step - loss: 0.8651 - accuracy: 0.7331 - val_loss: 0.6314 - val_accuracy: 0.8077
Epoch 20/30 - 9s 16ms/step - loss: 0.8523 - accuracy: 0.7383 - val_loss: 1.4044 - val_accuracy: 0.5997
Epoch 21/30 - 9s 16ms/step - loss: 0.8927 - accuracy: 0.7242 - val_loss: 0.5632 - val_accuracy: 0.8257
Epoch 22/30 - 9s 16ms/step - loss: 0.8497 - accuracy: 0.7378 - val_loss: 0.5962 - val_accuracy: 0.8140
Epoch 23/30 - 9s 15ms/step - loss: 0.8149 - accuracy: 0.7487 - val_loss: 0.4371 - val_accuracy: 0.8773
Epoch 24/30 - 9s 16ms/step - loss: 0.7800 - accuracy: 0.7590 - val_loss: 0.5243 - val_accuracy: 0.8397
Epoch 25/30 - 9s 16ms/step - loss: 0.7774 - accuracy: 0.7617 - val_loss: 0.4569 - val_accuracy: 0.8673
Epoch 26/30 - 9s 16ms/step - loss: 0.7967 - accuracy: 0.7525 - val_loss: 0.5340 - val_accuracy: 0.8390
Epoch 27/30 - 9s 16ms/step - loss: 0.7640 - accuracy: 0.7636 - val_loss: 0.4084 - val_accuracy: 0.8873
Epoch 28/30 - 9s 16ms/step - loss: 0.7560 - accuracy: 0.7685 - val_loss: 0.3675 - val_accuracy: 0.8990
Epoch 29/30 - 9s 16ms/step - loss: 0.7386 - accuracy: 0.7746 - val_loss: 0.3562 - val_accuracy: 0.8997
Epoch 30/30 - 9s 16ms/step - loss: 0.7299 - accuracy: 0.7763 - val_loss: 0.4843 - val_accuracy: 0.8583
Loss on Test Dataset: 0.4393
Accuracy on Test Dataset: 87.23%
CNN Error on Test Dataset: 12.77%
Hence, I will choose the L1 regularizer, since it achieved a higher test accuracy (92.47%) than the L2 regularizer (87.23%).
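The difference between the two penalties compared above can be sketched numerically. This is an illustrative NumPy example (the helper names `l1_penalty` and `l2_penalty` are my own); the factor 0.01 matches the regularization factor passed to `tf.keras.regularizers.l1`/`l2` in the models.

```python
import numpy as np

def l1_penalty(weights, factor=0.01):
    # L1 adds factor * sum(|w|) to the loss; it pushes weights towards exactly zero
    return factor * np.sum(np.abs(weights))

def l2_penalty(weights, factor=0.01):
    # L2 adds factor * sum(w^2) to the loss; it penalizes large weights more strongly
    return factor * np.sum(np.square(weights))

w = np.array([-2.0, 0.5, 0.0, 1.5])
print(l1_penalty(w))  # 0.01 * (2 + 0.5 + 0 + 1.5) = 0.04
print(l2_penalty(w))  # 0.01 * (4 + 0.25 + 0 + 2.25) = 0.065
```

Because L2 squares the weights, the large weight (-2.0) dominates its penalty, while L1 treats all magnitudes proportionally, which is why L1 tends to produce sparser weights.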
2. Optimizers
- Adam (Adaptive Moment Estimation)
- Maintains two moving averages for each parameter: the first moment (mean) and the second moment (uncentered variance) of the gradients
- Adapts the learning rate for each parameter based on these gradient statistics, making it well suited to a wide variety of optimization problems
- SGD (Stochastic Gradient Descent)
- Updates the model parameters in the opposite direction of the gradient of the loss with respect to the parameters
- Can suffer from slow convergence, especially in the presence of sparse or noisy gradients
- RMSprop (Root Mean Square Propagation)
- An adaptive learning-rate method designed to speed up convergence compared with plain SGD
- Keeps a moving average of the squared gradients for every weight and divides the gradient by the square root of this mean square
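The update rules described above can be sketched for a single parameter in plain NumPy. This is a simplified illustration of the mathematics, not the actual Keras optimizer implementations; the function names and hyperparameter defaults are my own choices (roughly following the common Keras defaults).

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    # SGD: move directly against the gradient
    return w - lr * grad

def rmsprop_step(w, grad, state, lr=0.001, rho=0.9, eps=1e-7):
    # RMSprop: moving average of squared gradients; scale the step by its RMS
    state["v"] = rho * state["v"] + (1 - rho) * grad ** 2
    return w - lr * grad / (np.sqrt(state["v"]) + eps)

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # Adam: first moment (mean) and second moment (uncentered variance),
    # each with bias correction for the early steps
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Minimize f(w) = w^2 (gradient 2w) starting from w = 1.0 with each optimizer
w_sgd = w_rms = w_adam = 1.0
rms_state = {"v": 0.0}
adam_state = {"m": 0.0, "v": 0.0, "t": 0}
for _ in range(100):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    w_rms = rmsprop_step(w_rms, 2 * w_rms, rms_state)
    w_adam = adam_step(w_adam, 2 * w_adam, adam_state)
print(w_sgd, w_rms, w_adam)  # all three move towards the minimum at 0
```

The key contrast: SGD's step size depends directly on the gradient magnitude, while RMSprop and Adam normalize by recent gradient statistics, so their step sizes stay close to the learning rate regardless of gradient scale.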
A. Input size: 128 by 128 pixels
1. SGD
complex_model_128_3_l2_SGD = Sequential()
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(32, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_SGD.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(64, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_SGD.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(128, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_SGD.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_SGD.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_3_l2_SGD.add(Flatten())
# Fully connected layer
complex_model_128_3_l2_SGD.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_3_l2_SGD.add(BatchNormalization())
# Dropout layer
complex_model_128_3_l2_SGD.add(Dropout(0.4))
# Output layer
complex_model_128_3_l2_SGD.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
# Compile complex_model_128_3_l2_SGD
complex_model_128_3_l2_SGD.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])
# Print the complex_model_128_3_l2_SGD summary
complex_model_128_3_l2_SGD.summary()
Model: "sequential_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_32 (Conv2D) (None, 126, 126, 32) 320
conv2d_33 (Conv2D) (None, 124, 124, 32) 9248
max_pooling2d_17 (MaxPoolin (None, 62, 62, 32) 0
g2D)
dropout_21 (Dropout) (None, 62, 62, 32) 0
conv2d_34 (Conv2D) (None, 60, 60, 64) 18496
conv2d_35 (Conv2D) (None, 58, 58, 64) 36928
max_pooling2d_18 (MaxPoolin (None, 29, 29, 64) 0
g2D)
dropout_22 (Dropout) (None, 29, 29, 64) 0
conv2d_36 (Conv2D) (None, 27, 27, 128) 73856
conv2d_37 (Conv2D) (None, 25, 25, 128) 147584
max_pooling2d_19 (MaxPoolin (None, 12, 12, 128) 0
g2D)
dropout_23 (Dropout) (None, 12, 12, 128) 0
flatten_6 (Flatten) (None, 18432) 0
dense_12 (Dense) (None, 128) 2359424
batch_normalization_5 (Batc (None, 128) 512
hNormalization)
dropout_24 (Dropout) (None, 128) 0
dense_13 (Dense) (None, 15) 1935
=================================================================
Total params: 2,648,303
Trainable params: 2,648,047
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_3_l2_SGD = complex_model_128_3_l2_SGD.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_3_l2_SGD.save_weights('complex_model_128_3_l2_SGD_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_3_l2_SGD)
# Final evaluation of model
scores = complex_model_128_3_l2_SGD.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))  # loss is not a percentage
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 - 45s 76ms/step - loss: 2.8847 - accuracy: 0.1160 - val_loss: 3.0840 - val_accuracy: 0.0667
Epoch 2/30 - 45s 79ms/step - loss: 2.5944 - accuracy: 0.2242 - val_loss: 2.9150 - val_accuracy: 0.1540
Epoch 3/30 - 46s 82ms/step - loss: 2.4015 - accuracy: 0.2791 - val_loss: 3.3412 - val_accuracy: 0.1527
Epoch 4/30 - 62s 109ms/step - loss: 2.2418 - accuracy: 0.3240 - val_loss: 2.9560 - val_accuracy: 0.1753
Epoch 5/30 - 53s 92ms/step - loss: 2.0427 - accuracy: 0.3776 - val_loss: 2.3544 - val_accuracy: 0.2707
Epoch 6/30 - 55s 97ms/step - loss: 1.8702 - accuracy: 0.4377 - val_loss: 2.9764 - val_accuracy: 0.1540
Epoch 7/30 - 52s 93ms/step - loss: 1.7281 - accuracy: 0.4770 - val_loss: 1.6205 - val_accuracy: 0.4957
Epoch 8/30 - 60s 107ms/step - loss: 1.6042 - accuracy: 0.5226 - val_loss: 1.3184 - val_accuracy: 0.6220
Epoch 9/30 - 66s 115ms/step - loss: 1.5013 - accuracy: 0.5514 - val_loss: 3.1717 - val_accuracy: 0.2877
Epoch 10/30 - 81s 143ms/step - loss: 1.4065 - accuracy: 0.5861 - val_loss: 1.6839 - val_accuracy: 0.4673
Epoch 11/30 - 32s 56ms/step - loss: 1.3091 - accuracy: 0.6145 - val_loss: 1.2844 - val_accuracy: 0.6083
Epoch 12/30 - 20s 36ms/step - loss: 1.2411 - accuracy: 0.6433 - val_loss: 1.9822 - val_accuracy: 0.4197
Epoch 13/30 - 20s 35ms/step - loss: 1.1579 - accuracy: 0.6735 - val_loss: 2.9148 - val_accuracy: 0.2677
Epoch 14/30 - 20s 35ms/step - loss: 1.0873 - accuracy: 0.6961 - val_loss: 2.0112 - val_accuracy: 0.4033
Epoch 15/30 - 20s 36ms/step - loss: 1.0162 - accuracy: 0.7230 - val_loss: 6.3398 - val_accuracy: 0.1673
Epoch 16/30 - 21s 37ms/step - loss: 0.9785 - accuracy: 0.7303 - val_loss: 0.8198 - val_accuracy: 0.7947
Epoch 17/30 - 20s 35ms/step - loss: 0.9119 - accuracy: 0.7553 - val_loss: 1.7595 - val_accuracy: 0.5460
Epoch 18/30 - 20s 34ms/step - loss: 0.8577 - accuracy: 0.7697 - val_loss: 0.7021 - val_accuracy: 0.8150
Epoch 19/30 - 20s 35ms/step - loss: 0.8078 - accuracy: 0.7892 - val_loss: 0.9684 - val_accuracy: 0.7237
Epoch 20/30 - 26s 45ms/step - loss: 0.7776 - accuracy: 0.7983 - val_loss: 1.4557 - val_accuracy: 0.5697
Epoch 21/30 - 23s 40ms/step - loss: 0.7449 - accuracy: 0.8084 - val_loss: 0.7774 - val_accuracy: 0.7750
Epoch 22/30 - 21s 37ms/step - loss: 0.7204 - accuracy: 0.8162 - val_loss: 1.1335 - val_accuracy: 0.6960
Epoch 23/30 - 21s 37ms/step - loss: 0.6980 - accuracy: 0.8253 - val_loss: 0.8084 - val_accuracy: 0.7603
Epoch 24/30 - 24s 42ms/step - loss: 0.6471 - accuracy: 0.8382 - val_loss: 1.0631 - val_accuracy: 0.6927
Epoch 25/30 - 32s 56ms/step - loss: 0.6340 - accuracy: 0.8438 - val_loss: 1.3910 - val_accuracy: 0.5797
Epoch 26/30 - 50s 89ms/step - loss: 0.6027 - accuracy: 0.8549 - val_loss: 1.2592 - val_accuracy: 0.6340
Epoch 27/30 - 25s 45ms/step - loss: 0.5801 - accuracy: 0.8620 - val_loss: 0.7457 - val_accuracy: 0.7973
Epoch 28/30 - 20s 36ms/step - loss: 0.5597 - accuracy: 0.8675 - val_loss: 0.4748 - val_accuracy: 0.8887
Epoch 29/30 - 21s 37ms/step - loss: 0.5435 - accuracy: 0.8721 - val_loss: 0.5884 - val_accuracy: 0.8463
Epoch 30/30 - 55s 97ms/step - loss: 0.5310 - accuracy: 0.8763 - val_loss: 0.6686 - val_accuracy: 0.8090
Loss on Test Dataset: 0.6769
Accuracy on Test Dataset: 80.87%
CNN Error on Test Dataset: 19.13%
2. RMSprop
complex_model_128_3_l2_RMSprop = Sequential()
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(32, (3,3), input_shape=(128, 128, 1), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(32, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_RMSprop.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(64, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(64, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_RMSprop.add(Dropout(0.4))
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(128, (3,3), activation='relu'))
# Convolutional Layer
complex_model_128_3_l2_RMSprop.add(Conv2D(128, (3,3), activation='relu'))
# Pooling layer
complex_model_128_3_l2_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_128_3_l2_RMSprop.add(Dropout(0.4))
# Flatten the feature map
complex_model_128_3_l2_RMSprop.add(Flatten())
# Fully connected layer
complex_model_128_3_l2_RMSprop.add(Dense(128, activation='relu'))
# Batch Normalization layer
complex_model_128_3_l2_RMSprop.add(BatchNormalization())
# Dropout layer
complex_model_128_3_l2_RMSprop.add(Dropout(0.4))
# Output layer
complex_model_128_3_l2_RMSprop.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
# Compile complex_model_128_3_l2_RMSprop
complex_model_128_3_l2_RMSprop.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
# Print the complex_model_128_3_l2_RMSprop summary
complex_model_128_3_l2_RMSprop.summary()
Model: "sequential_11"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_38 (Conv2D) (None, 126, 126, 32) 320
conv2d_39 (Conv2D) (None, 124, 124, 32) 9248
max_pooling2d_20 (MaxPoolin (None, 62, 62, 32) 0
g2D)
dropout_25 (Dropout) (None, 62, 62, 32) 0
conv2d_40 (Conv2D) (None, 60, 60, 64) 18496
conv2d_41 (Conv2D) (None, 58, 58, 64) 36928
max_pooling2d_21 (MaxPoolin (None, 29, 29, 64) 0
g2D)
dropout_26 (Dropout) (None, 29, 29, 64) 0
conv2d_42 (Conv2D) (None, 27, 27, 128) 73856
conv2d_43 (Conv2D) (None, 25, 25, 128) 147584
max_pooling2d_22 (MaxPoolin (None, 12, 12, 128) 0
g2D)
dropout_27 (Dropout) (None, 12, 12, 128) 0
flatten_7 (Flatten) (None, 18432) 0
dense_14 (Dense) (None, 128) 2359424
batch_normalization_6 (Batc (None, 128) 512
hNormalization)
dropout_28 (Dropout) (None, 128) 0
dense_15 (Dense) (None, 15) 1935
=================================================================
Total params: 2,648,303
Trainable params: 2,648,047
Non-trainable params: 256
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_128_3_l2_RMSprop = complex_model_128_3_l2_RMSprop.fit(train_ds_128_dataAug_rescaled, validation_data = validation_ds_128_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_128_3_l2_RMSprop.save_weights('complex_model_128_3_l2_RMSprop_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_128_3_l2_RMSprop)
# Final evaluation of model
scores = complex_model_128_3_l2_RMSprop.evaluate(test_ds_128_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))  # loss is not a percentage
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 - 33s 44ms/step - loss: 2.3277 - accuracy: 0.2880 - val_loss: 2.0907 - val_accuracy: 0.3513
Epoch 2/30 - 21s 37ms/step - loss: 1.6620 - accuracy: 0.4824 - val_loss: 2.5262 - val_accuracy: 0.2817
Epoch 3/30 - 18s 31ms/step - loss: 1.3595 - accuracy: 0.5903 - val_loss: 1.9372 - val_accuracy: 0.4583
Epoch 4/30 - 15s 25ms/step - loss: 1.1428 - accuracy: 0.6634 - val_loss: 0.9575 - val_accuracy: 0.7050
Epoch 5/30 - 15s 25ms/step - loss: 0.9473 - accuracy: 0.7259 - val_loss: 1.0391 - val_accuracy: 0.6977
Epoch 6/30 - 15s 25ms/step - loss: 0.7937 - accuracy: 0.7774 - val_loss: 0.6227 - val_accuracy: 0.8240
Epoch 7/30 - 14s 25ms/step - loss: 0.6865 - accuracy: 0.8113 - val_loss: 1.0153 - val_accuracy: 0.7310
Epoch 8/30 - 14s 25ms/step - loss: 0.6138 - accuracy: 0.8365 - val_loss: 0.4776 - val_accuracy: 0.8773
Epoch 9/30 - 14s 25ms/step - loss: 0.5422 - accuracy: 0.8561 - val_loss: 0.3654 - val_accuracy: 0.9127
Epoch 10/30 - 14s 25ms/step - loss: 0.4953 - accuracy: 0.8715 - val_loss: 0.5664 - val_accuracy: 0.8380
Epoch 11/30 - 14s 25ms/step - loss: 0.4590 - accuracy: 0.8800 - val_loss: 0.3844 - val_accuracy: 0.8953
Epoch 12/30 - 14s 24ms/step - loss: 0.4166 - accuracy: 0.8902 - val_loss: 1.6069 - val_accuracy: 0.5847
Epoch 13/30 - 14s 24ms/step - loss: 0.3922 - accuracy: 0.8978 - val_loss: 0.4366 - val_accuracy: 0.8807
Epoch 14/30 - 14s 25ms/step - loss: 0.3610 - accuracy: 0.9126 - val_loss: 0.7995 - val_accuracy: 0.7720
Epoch 15/30 - 14s 25ms/step - loss: 0.3255 - accuracy: 0.9179 - val_loss: 1.1625 - val_accuracy: 0.7243
Epoch 16/30 - 14s 25ms/step - loss: 0.3086 - accuracy: 0.9211 - val_loss: 0.3776 - val_accuracy: 0.8943
Epoch 17/30 - 14s 25ms/step - loss: 0.2952 - accuracy: 0.9259 - val_loss: 0.4020 - val_accuracy: 0.8770
Epoch 18/30 - 14s 24ms/step - loss: 0.2801 - accuracy: 0.9298 - val_loss: 0.2576 - val_accuracy: 0.9330
Epoch 19/30 - 14s 25ms/step - loss: 0.2727 - accuracy: 0.9310 - val_loss: 0.4901 - val_accuracy: 0.8520
Epoch 20/30 - 14s 25ms/step - loss: 0.2566 - accuracy: 0.9343 - val_loss: 0.3702 - val_accuracy: 0.8887
Epoch 21/30 - 14s 24ms/step - loss: 0.2446 - accuracy: 0.9379 - val_loss: 1.1856 - val_accuracy: 0.6593
Epoch 22/30 - 14s 24ms/step - loss: 0.2255 - accuracy: 0.9432 - val_loss: 0.3322 - val_accuracy: 0.9097
Epoch 23/30 - 14s 24ms/step - loss: 0.2261 - accuracy: 0.9420 - val_loss: 0.1796 - val_accuracy: 0.9570
Epoch 24/30 - 14s 24ms/step - loss: 0.2162 - accuracy: 0.9458 - val_loss: 0.5956 - val_accuracy: 0.8097
Epoch 25/30 - 14s 24ms/step - loss: 0.2126 - accuracy: 0.9466 - val_loss: 0.7735 - val_accuracy: 0.7760
Epoch 26/30 - 14s 25ms/step - loss: 0.2072 - accuracy: 0.9477 - val_loss: 0.1126 - val_accuracy: 0.9767
Epoch 27/30 - 14s 25ms/step - loss: 0.1903 - accuracy: 0.9523 - val_loss: 0.2158 - val_accuracy: 0.9463
Epoch 28/30 - 14s 25ms/step - loss: 0.1831 - accuracy: 0.9535 - val_loss: 0.1550 - val_accuracy: 0.9607
Epoch 29/30 - 14s 24ms/step - loss: 0.1835 - accuracy: 0.9546 - val_loss: 0.4244 - val_accuracy: 0.8827
Epoch 30/30 - 14s 24ms/step - loss: 0.1738 - accuracy: 0.9561 - val_loss: 0.2662 - val_accuracy: 0.9213
Loss on Test Dataset: 0.2576
Accuracy on Test Dataset: 92.73%
CNN Error on Test Dataset: 7.27%
Insights:
- Adam appears to be the best overall optimizer for this model.
- I will stick with Adam as the optimizer.
B. Input size: 31 by 31 pixels
1. SGD
# Create model
complex_model_31_3_l1_SGD = Sequential()
# Input layer
complex_model_31_3_l1_SGD.add(Conv2D(32, (3, 3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_3_l1_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_3_l1_SGD.add(Dropout(0.2))
# Convolutional layer
complex_model_31_3_l1_SGD.add(Conv2D(64, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Batch Normalization
complex_model_31_3_l1_SGD.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1_SGD.add(Dropout(0.3))
# Convolutional layer
complex_model_31_3_l1_SGD.add(Conv2D(128, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1_SGD.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout Layer
complex_model_31_3_l1_SGD.add(Dropout(0.2))
# Flatten Feature Map
complex_model_31_3_l1_SGD.add(Flatten())
# Fully Connected Dense Layer
complex_model_31_3_l1_SGD.add(Dense(128, activation='relu'))
# Batch Normalization
complex_model_31_3_l1_SGD.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1_SGD.add(Dropout(0.2))
# Fully Connected Dense Layer
complex_model_31_3_l1_SGD.add(Dense(64, activation='relu'))
# Output layer
complex_model_31_3_l1_SGD.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
# Compile complex_model_31_3_l1_SGD
complex_model_31_3_l1_SGD.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])
# complex_model_31_3_l1_SGD summary
complex_model_31_3_l1_SGD.summary()
Model: "sequential_35"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_113 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_80 (MaxPoolin (None, 14, 14, 32) 0
g2D)
dropout_89 (Dropout) (None, 14, 14, 32) 0
conv2d_114 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_81 (MaxPoolin (None, 6, 6, 64) 0
g2D)
batch_normalization_32 (Bat (None, 6, 6, 64) 256
chNormalization)
dropout_90 (Dropout) (None, 6, 6, 64) 0
conv2d_115 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_82 (MaxPoolin (None, 2, 2, 128) 0
g2D)
dropout_91 (Dropout) (None, 2, 2, 128) 0
flatten_29 (Flatten) (None, 512) 0
dense_68 (Dense) (None, 128) 65664
batch_normalization_33 (Bat (None, 128) 512
chNormalization)
dropout_92 (Dropout) (None, 128) 0
dense_69 (Dense) (None, 64) 8256
dense_70 (Dense) (None, 15) 975
=================================================================
Total params: 168,335
Trainable params: 167,951
Non-trainable params: 384
_________________________________________________________________
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_3_l1_SGD = complex_model_31_3_l1_SGD.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_3_l1_SGD.save_weights('complex_model_31_3_l1_SGD_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_3_l1_SGD)
# Final evaluation of model
scores = complex_model_31_3_l1_SGD.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f" % (scores[0]))  # loss is not a percentage
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 - 10s 17ms/step - loss: 3.5758 - accuracy: 0.1803 - val_loss: 3.5372 - val_accuracy: 0.1100
Epoch 2/30 - 9s 16ms/step - loss: 2.8657 - accuracy: 0.2602 - val_loss: 3.0252 - val_accuracy: 0.1667
Epoch 3/30 - 9s 16ms/step - loss: 2.4943 - accuracy: 0.3018 - val_loss: 2.9160 - val_accuracy: 0.1827
Epoch 4/30 - 9s 16ms/step - loss: 2.3122 - accuracy: 0.3240 - val_loss: 2.7399 - val_accuracy: 0.2093
Epoch 5/30 - 9s 16ms/step - loss: 2.2179 - accuracy: 0.3406 - val_loss: 2.4555 - val_accuracy: 0.2813
Epoch 6/30 - 9s 16ms/step - loss: 2.1510 - accuracy: 0.3471 - val_loss: 2.3253 - val_accuracy: 0.2807
Epoch 7/30 - 9s 16ms/step - loss: 2.0935 - accuracy: 0.3656 - val_loss: 3.6947 - val_accuracy: 0.1393
Epoch 8/30 - 9s 15ms/step - loss: 2.0525 - accuracy: 0.3754 - val_loss: 2.4942 - val_accuracy: 0.2580
Epoch 9/30 - 8s 15ms/step - loss: 2.0065 - accuracy: 0.3929 - val_loss: 2.5028 - val_accuracy: 0.2750
Epoch 10/30 - 9s 16ms/step - loss: 1.9687 - accuracy: 0.3972 - val_loss: 2.2043 - val_accuracy: 0.3403
Epoch 11/30 - 9s 16ms/step - loss: 1.9217 - accuracy: 0.4152 - val_loss: 3.9492 - val_accuracy: 0.1470
Epoch 12/30 - 9s 16ms/step - loss: 1.8905 - accuracy: 0.4245 - val_loss: 2.4320 - val_accuracy: 0.2990
Epoch 13/30 - 9s 16ms/step - loss: 1.8509 - accuracy: 0.4402 - val_loss: 2.5448 - val_accuracy: 0.2897
Epoch 14/30 - 9s 16ms/step - loss: 1.8166 - accuracy: 0.4503 - val_loss: 2.7866 - val_accuracy: 0.2583
Epoch 15/30 - 9s 15ms/step - loss: 1.7807 - accuracy: 0.4628 - val_loss: 3.6068 - val_accuracy: 0.2460
Epoch 16/30 - 9s 16ms/step - loss: 1.7554 - accuracy: 0.4747 - val_loss: 3.1680 - val_accuracy: 0.1730
Epoch 17/30 - 9s 16ms/step - loss: 1.7463 - accuracy: 0.4765 - val_loss: 3.1721 - val_accuracy: 0.2233
Epoch 18/30 - 9s 16ms/step - loss: 1.7078 - accuracy: 0.4824 - val_loss: 2.5862 - val_accuracy: 0.3113
Epoch 19/30 - 8s 14ms/step - loss: 1.6790 - accuracy: 0.4948 - val_loss: 4.0371 - val_accuracy: 0.1757
Epoch 20/30 - 9s 16ms/step - loss: 1.6502 - accuracy: 0.5094 - val_loss: 1.6571 - val_accuracy: 0.5173
Epoch 21/30 - 9s 16ms/step - loss: 1.6284 - accuracy: 0.5129 - val_loss: 1.8519 - val_accuracy: 0.4390
Epoch 22/30 - 9s 16ms/step - loss: 1.5958 - accuracy: 0.5266 - val_loss: 3.1363 - val_accuracy: 0.2500
Epoch 23/30 - 9s 16ms/step - loss: 1.5780 - accuracy: 0.5322 - val_loss: 2.8243 - val_accuracy: 0.2910
Epoch 24/30 - 9s 16ms/step - loss: 1.5478 - accuracy: 0.5406 - val_loss: 4.6573 - val_accuracy: 0.1977
Epoch 25/30 - 9s 16ms/step - loss: 1.5276 - accuracy: 0.5500 - val_loss: 2.0766 - val_accuracy: 0.4017
Epoch 26/30 - 9s 16ms/step - loss: 1.5185 - accuracy: 0.5524 - val_loss: 1.4910 - val_accuracy: 0.5523
Epoch 27/30 - 9s 16ms/step - loss: 1.4913 - accuracy: 0.5554 - val_loss: 3.6806 - val_accuracy: 0.2647
Epoch 28/30 - 9s 16ms/step - loss: 1.4758 - accuracy: 0.5692 - val_loss: 1.7641 - val_accuracy: 0.4740
Epoch 29/30 - 8s 15ms/step - loss: 1.4476 - accuracy: 0.5805 - val_loss: 2.0668 - val_accuracy: 0.3920
Epoch 30/30 - 9s 16ms/step - loss: 1.4489 - accuracy: 0.5795 - val_loss: 2.0732 - val_accuracy: 0.4133
Loss on Test Dataset: 2.0769
Accuracy on Test Dataset: 41.20%
CNN Error on Test Dataset: 58.80%
2. RMSprop
# Create model
complex_model_31_3_l1_RMSprop = Sequential()
# Input layer
complex_model_31_3_l1_RMSprop.add(Conv2D(32, (3, 3), input_shape=(31, 31, 1), activation='relu'))
# Pooling layer
complex_model_31_3_l1_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout layer
complex_model_31_3_l1_RMSprop.add(Dropout(0.2))
# Convolutional layer
complex_model_31_3_l1_RMSprop.add(Conv2D(64, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Batch Normalization
complex_model_31_3_l1_RMSprop.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1_RMSprop.add(Dropout(0.3))
# Convolutional layer
complex_model_31_3_l1_RMSprop.add(Conv2D(128, (3, 3), activation='relu'))
# Pooling layer
complex_model_31_3_l1_RMSprop.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout Layer
complex_model_31_3_l1_RMSprop.add(Dropout(0.2))
# Flatten Feature Map
complex_model_31_3_l1_RMSprop.add(Flatten())
# Fully Connected Dense Layer
complex_model_31_3_l1_RMSprop.add(Dense(128, activation='relu'))
# Batch Normalization
complex_model_31_3_l1_RMSprop.add(BatchNormalization())
# Dropout layer
complex_model_31_3_l1_RMSprop.add(Dropout(0.2))
# Fully Connected Dense Layer
complex_model_31_3_l1_RMSprop.add(Dense(64, activation='relu'))
# Output layer
complex_model_31_3_l1_RMSprop.add(Dense(num_classes, activation='softmax', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
# Compile complex_model_31_3_l1_RMSprop
complex_model_31_3_l1_RMSprop.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
# complex_model_31_3_l1_RMSprop summary
complex_model_31_3_l1_RMSprop.summary()
Model: "sequential_36"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_116 (Conv2D) (None, 29, 29, 32) 320
max_pooling2d_83 (MaxPooling2D) (None, 14, 14, 32) 0
dropout_93 (Dropout) (None, 14, 14, 32) 0
conv2d_117 (Conv2D) (None, 12, 12, 64) 18496
max_pooling2d_84 (MaxPooling2D) (None, 6, 6, 64) 0
batch_normalization_34 (BatchNormalization) (None, 6, 6, 64) 256
dropout_94 (Dropout) (None, 6, 6, 64) 0
conv2d_118 (Conv2D) (None, 4, 4, 128) 73856
max_pooling2d_85 (MaxPooling2D) (None, 2, 2, 128) 0
dropout_95 (Dropout) (None, 2, 2, 128) 0
flatten_30 (Flatten) (None, 512) 0
dense_71 (Dense) (None, 128) 65664
batch_normalization_35 (BatchNormalization) (None, 128) 512
dropout_96 (Dropout) (None, 128) 0
dense_72 (Dense) (None, 64) 8256
dense_73 (Dense) (None, 15) 975
=================================================================
Total params: 168,335
Trainable params: 167,951
Non-trainable params: 384
_________________________________________________________________
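The output shapes and parameter counts in the summary can be verified by hand: each 3x3 'valid' convolution shrinks the feature map by 2 pixels, each 2x2 pooling halves it (rounding down), and each conv layer has (3*3*channels_in + 1) weights per filter. A minimal standalone sketch of that arithmetic (illustration only, not part of the notebook pipeline):

```python
def conv_out(n, k=3):
    # 'valid' convolution, stride 1: output size = n - k + 1
    return n - k + 1

def pool_out(n, p=2):
    # non-overlapping pooling: floor division
    return n // p

def conv_params(k, c_in, c_out):
    # k*k*c_in weights per filter, plus one bias per filter
    return (k * k * c_in + 1) * c_out

n = 31
for c_in, c_out in [(1, 32), (32, 64), (64, 128)]:
    n = pool_out(conv_out(n))
    print(c_out, n, conv_params(3, c_in, c_out))  # 32 14 320 / 64 6 18496 / 128 2 73856
flat = n * n * 128
print(flat)  # 512, matching the Flatten output in the summary
```

The printed values match the conv2d/max_pooling2d rows and the 512-unit Flatten layer above.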
# Fit model with rescaled & data augmented train dataset
history_complex_model_31_3_l1_RMSprop = complex_model_31_3_l1_RMSprop.fit(train_ds_31_dataAug_rescaled, validation_data = validation_ds_31_rescaled, epochs = 30, verbose = 1)
# Save model weights
complex_model_31_3_l1_RMSprop.save_weights('complex_model_31_3_l1_RMSprop_weights.h5')
# Summarize history for accuracy & loss
plotCompareMetrics(history_complex_model_31_3_l1_RMSprop)
# Final evaluation of model
scores = complex_model_31_3_l1_RMSprop.evaluate(test_ds_31_rescaled, verbose = 0)
print("Loss on Test Dataset: %.4f%%" % (scores[0]))
print("Accuracy on Test Dataset: %.2f%%" % (scores[1]*100))
print("CNN Error on Test Dataset: %.2f%%" % (100 - scores[1]*100))
Epoch 1/30 566/566 [==============================] - 13s 22ms/step - loss: 2.8488 - accuracy: 0.3046 - val_loss: 4.1328 - val_accuracy: 0.1343
Epoch 2/30 566/566 [==============================] - 13s 22ms/step - loss: 2.0210 - accuracy: 0.4268 - val_loss: 1.8131 - val_accuracy: 0.4623
Epoch 3/30 566/566 [==============================] - 12s 22ms/step - loss: 1.7680 - accuracy: 0.4954 - val_loss: 1.8614 - val_accuracy: 0.4670
Epoch 4/30 566/566 [==============================] - 12s 22ms/step - loss: 1.6290 - accuracy: 0.5337 - val_loss: 1.7904 - val_accuracy: 0.5040
Epoch 5/30 566/566 [==============================] - 13s 22ms/step - loss: 1.5215 - accuracy: 0.5670 - val_loss: 2.7096 - val_accuracy: 0.3207
Epoch 6/30 566/566 [==============================] - 13s 22ms/step - loss: 1.4421 - accuracy: 0.5953 - val_loss: 2.0515 - val_accuracy: 0.4377
Epoch 7/30 566/566 [==============================] - 11s 20ms/step - loss: 1.3734 - accuracy: 0.6200 - val_loss: 3.9379 - val_accuracy: 0.2673
Epoch 8/30 566/566 [==============================] - 12s 22ms/step - loss: 1.3276 - accuracy: 0.6301 - val_loss: 1.6018 - val_accuracy: 0.5687
Epoch 9/30 566/566 [==============================] - 12s 22ms/step - loss: 1.2848 - accuracy: 0.6433 - val_loss: 1.0317 - val_accuracy: 0.7227
Epoch 10/30 566/566 [==============================] - 12s 22ms/step - loss: 1.2208 - accuracy: 0.6638 - val_loss: 0.9469 - val_accuracy: 0.7583
Epoch 11/30 566/566 [==============================] - 12s 22ms/step - loss: 1.2058 - accuracy: 0.6715 - val_loss: 1.4766 - val_accuracy: 0.6113
Epoch 12/30 566/566 [==============================] - 12s 22ms/step - loss: 1.1603 - accuracy: 0.6835 - val_loss: 0.8766 - val_accuracy: 0.7803
Epoch 13/30 566/566 [==============================] - 12s 22ms/step - loss: 1.1369 - accuracy: 0.6871 - val_loss: 0.8082 - val_accuracy: 0.8013
Epoch 14/30 566/566 [==============================] - 12s 21ms/step - loss: 1.1103 - accuracy: 0.6977 - val_loss: 1.1794 - val_accuracy: 0.6787
Epoch 15/30 566/566 [==============================] - 12s 21ms/step - loss: 1.0827 - accuracy: 0.7093 - val_loss: 1.6141 - val_accuracy: 0.5620
Epoch 16/30 566/566 [==============================] - 12s 22ms/step - loss: 1.0633 - accuracy: 0.7154 - val_loss: 0.8495 - val_accuracy: 0.7860
Epoch 17/30 566/566 [==============================] - 13s 22ms/step - loss: 1.0440 - accuracy: 0.7201 - val_loss: 0.7923 - val_accuracy: 0.7967
Epoch 18/30 566/566 [==============================] - 12s 22ms/step - loss: 1.0472 - accuracy: 0.7191 - val_loss: 1.3680 - val_accuracy: 0.6313
Epoch 19/30 566/566 [==============================] - 12s 22ms/step - loss: 1.0135 - accuracy: 0.7315 - val_loss: 0.7122 - val_accuracy: 0.8243
Epoch 20/30 566/566 [==============================] - 12s 22ms/step - loss: 1.0132 - accuracy: 0.7287 - val_loss: 0.7586 - val_accuracy: 0.8073
Epoch 21/30 566/566 [==============================] - 12s 22ms/step - loss: 0.9894 - accuracy: 0.7329 - val_loss: 0.5864 - val_accuracy: 0.8697
Epoch 22/30 566/566 [==============================] - 11s 19ms/step - loss: 0.9850 - accuracy: 0.7368 - val_loss: 0.9186 - val_accuracy: 0.7543
Epoch 23/30 566/566 [==============================] - 12s 22ms/step - loss: 0.9475 - accuracy: 0.7513 - val_loss: 3.2062 - val_accuracy: 0.3783
Epoch 24/30 566/566 [==============================] - 12s 22ms/step - loss: 0.9494 - accuracy: 0.7466 - val_loss: 0.8951 - val_accuracy: 0.7557
Epoch 25/30 566/566 [==============================] - 12s 22ms/step - loss: 0.9430 - accuracy: 0.7514 - val_loss: 1.9005 - val_accuracy: 0.5310
Epoch 26/30 566/566 [==============================] - 13s 22ms/step - loss: 0.9312 - accuracy: 0.7540 - val_loss: 0.7661 - val_accuracy: 0.8010
Epoch 27/30 566/566 [==============================] - 13s 22ms/step - loss: 0.9115 - accuracy: 0.7599 - val_loss: 1.8628 - val_accuracy: 0.5197
Epoch 28/30 566/566 [==============================] - 13s 22ms/step - loss: 0.9237 - accuracy: 0.7550 - val_loss: 0.8218 - val_accuracy: 0.7910
Epoch 29/30 566/566 [==============================] - 12s 21ms/step - loss: 0.9042 - accuracy: 0.7643 - val_loss: 0.6165 - val_accuracy: 0.8463
Epoch 30/30 566/566 [==============================] - 12s 21ms/step - loss: 0.8770 - accuracy: 0.7712 - val_loss: 0.6851 - val_accuracy: 0.8383
Loss on Test Dataset: 0.6771
Accuracy on Test Dataset: 83.73%
CNN Error on Test Dataset: 16.27%
Hence, I will stick with Adam as the optimizer.
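For context on what RMSprop does differently: it divides each gradient by a running root-mean-square of recent gradients, adapting the step size per weight. A toy numpy sketch of the update rule (default-style hyperparameters assumed; this is an illustration, not the notebook's training code):

```python
import numpy as np

def rmsprop_step(w, g, cache, lr=0.001, rho=0.9, eps=1e-7):
    # Keep a decaying average of squared gradients per weight
    cache = rho * cache + (1 - rho) * g ** 2
    # Scale the step for each weight by the RMS of its recent gradients
    w = w - lr * g / (np.sqrt(cache) + eps)
    return w, cache

w, cache = np.array([1.0]), np.zeros(1)
for _ in range(5):  # five steps with a constant gradient of 2.0
    w, cache = rmsprop_step(w, np.array([2.0]), cache)
print(w)  # the weight decreases, with step sizes shrinking as cache grows
```

Adam extends this idea with a momentum term and bias correction, which is often why it trains more stably in practice.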
6. Model Evaluation
A. Input size: 128 by 128 pixels
tf.keras.utils.plot_model(complex_model_128_3_l2, show_shapes=True)
# Load weights
complex_model_128_3_l2.load_weights('cnn_complex_model_128_3_l2_best_weights.h5')
# Make Predictions on Test Dataset
y_trueLabels_128, y_preds_128, trueClass_128, FalseClass_128 = makePredictions(complex_model_128_3_l2, test_ds_128_rescaled, class_names)
# Plot Confusion Matrix
plot_confusion_matrix(y_trueLabels_128, y_preds_128, class_names)
94/94 [==============================] - 1s 16ms/step
Most Labels Getting Mixed Up
- Capsicum:
- 12 wrongly classified as Tomato, Papaya, Bean, Brinjal, Bottle_Gourd
- Tomato:
- 11 wrongly classified as Radish, Potato, Cabbage, Bean, Brinjal
Otherwise, the model is able to predict most labels accurately, with fewer than 10 samples wrongly classified per label.
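The mix-ups listed above are read directly off the confusion matrix, whose entry (i, j) counts samples of true class i predicted as class j. A minimal numpy sketch with hypothetical labels (the notebook itself uses the plot_confusion_matrix helper):

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    # cm[i, j] = number of samples of true class i predicted as class j;
    # off-diagonal entries are the "mixed up" labels
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# hypothetical true/predicted labels for 3 classes
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
cm = confusion_counts(y_true, y_pred, 3)
print(cm.trace(), cm.sum() - cm.trace())  # correct count vs wrong count: 4 2
```

sklearn.metrics.confusion_matrix (imported at the top of the notebook) computes the same matrix directly.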
B. Input size: 31 by 31 pixels
tf.keras.utils.plot_model(complex_model_31_3_l1, show_shapes=True)
# load weights
complex_model_31_3_l1.load_weights('cnn_complex_model_31_3_l1_best_weights.h5')
# Make Predictions on Test Dataset
y_trueLabels_31, y_preds_31, trueClass_31, FalseClass_31 = makePredictions(complex_model_31_3_l1, test_ds_31_rescaled, class_names)
# Plot Confusion Matrix
plot_confusion_matrix(y_trueLabels_31, y_preds_31, class_names)
94/94 [==============================] - 0s 2ms/step
Most Labels Getting Mixed Up
- Cabbage:
- 25 wrongly classified as Cauliflower, Pumpkin, Broccoli
- Radish:
- 11 wrongly classified as Potato, Cabbage, Bean, Brinjal
Otherwise, the model is able to predict most labels accurately.
Compared to the 128x128 input, the 31x31 model makes noticeably more labelling errors.
7. Error Analysis
A. Input size: 128 by 128 pixels
plot_prediction(FalseClass_128)
Insights:
- The model gets confused between Capsicum/Tomato due to their similar circular shape, and between Carrot/Potato/Papaya due to their similar oval shape
- The model fails to correctly classify some vegetables because the test images are stretched or taken from an awkward angle, as seen in the tomato images
B. Input size: 31 by 31 pixels
plot_prediction(FalseClass_31)
Insights:
Similarly, the model gets confused between vegetables such as Brinjal/Tomato due to their similar circular shape.
However, there are also instances where the vegetables have clearly different shapes but the model still classifies them wrongly.
- Bitter Gourd has an oval shape while Tomato has a circular shape
At 31 x 31 pixels, these shapes are harder for the model to capture.
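The loss of shape detail at 31x31 can be illustrated directly: downsampling averages fine structures away. A toy numpy sketch using simple block averaging (an assumption for illustration; not the actual resizing used in the dataset pipeline):

```python
import numpy as np

def block_downsample(img, factor):
    # Average non-overlapping factor x factor blocks (area-style resize)
    h, w = img.shape
    img = img[:h - h % factor, :w - w % factor]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# A 128x128 image with a thin 2-pixel-wide vertical stripe (a fine "shape feature")
img = np.zeros((128, 128))
img[:, 63:65] = 1.0
small = block_downsample(img, 4)  # 32x32, close to the 31x31 input size
print(img.max(), small.max())  # stripe contrast diluted from 1.0 to 0.25
```

The thin stripe keeps full contrast at 128x128 but is averaged down to a quarter of its intensity at roughly 31x31, which is exactly the kind of detail the smaller model can no longer rely on.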